WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 5 : 
C12N 9/22, 15/55, 15/74 



A1



(11) International Publication Number: WO 94/18313

(43) International Publication Date: 18 August 1994 (18.08.94) 



(21) International Application Number: PCT/US94/01201

(22) International Filing Date: 10 February 1994 (10.02.94)



(30) Priority Data: 

08/017,493    12 February 1993 (12.02.93)    US



(71) Applicant: THE JOHNS-HOPKINS UNIVERSITY [US/US];

34th & Charles Streets, Baltimore, MD 21218 (US). 

(72) Inventor: CHANDRASEGARAN, Srinivasan; 4 East 32nd 

Street, #206, Baltimore, MD 21218 (US). 

(74) Agents: KOKULIS, Paul, N. et al.; Cushman, Darby &
Cushman, 1100 New York Avenue, N.W., Washington, DC
20005 (US).



(81) Designated States: AU, CA, JP, NZ, European patent (AT, BE,
CH, DE, DK, ES, FR, GB, GR, IE, IT, LU, MC, NL, PT,
SE).



Published 

With international search report 
With amended claims. 



(54) Title: FUNCTIONAL DOMAINS IN FLAVOBACTERIUM OKEANOKOITES (FOKI) RESTRICTION ENDONUCLEASE
(57) Abstract 

The present inventors have identified the recognition and cleavage domains of the FokI restriction endonuclease. Accordingly,
the present invention relates to the DNA segments encoding the recognition and cleavage domains of the FokI restriction endonuclease,
respectively. The 41 kDa N-terminal fragment constitutes the FokI recognition domain while the 25 kDa C-terminal fragment constitutes
the FokI cleavage nuclease domain. The present invention also relates to hybrid restriction enzymes comprising the nuclease domain of
the FokI restriction endonuclease linked to a recognition domain of another enzyme. Additionally, the present invention relates to the
construction of two insertion mutants of FokI endonuclease.


















WO 94/18313 PCT/US94/01201



FUNCTIONAL DOMAINS IN FLAVOBACTERIUM OKEANOKOITES
(FOKI) RESTRICTION ENDONUCLEASE



BACKGROUND OF THE INVENTION 

1. Field of the Invention: 

The present invention relates to the FokI
restriction endonuclease system. In particular, the
present invention relates to DNA segments encoding
the separate functional domains of this restriction
endonuclease system.

The present invention also relates to the
construction of two insertion mutants of FokI
endonuclease.

2. Background Information: 

Type II endonucleases and modification
methylases are bacterial enzymes that recognize
specific sequences in duplex DNA. The endonuclease
cleaves the DNA while the methylase methylates
adenine or cytosine residues so as to protect the
host genome against cleavage [Type II restriction
and modification enzymes, in Nucleases (Eds. Modrich
and Roberts), Cold Spring Harbor Laboratory, New
York, pp. 109-154, 1982]. These
restriction-modification (R-M) systems function to protect cells
from infection by phage and plasmid molecules that
would otherwise destroy them.
As many as 2500 restriction enzymes with
over 200 specificities have been detected and
purified (Wilson and Murray, Annu. Rev. Genet.
25:585-627, 1991). The recognition sites of most of
these enzymes are 4-6 base pairs long. The small
size of the recognition sites is beneficial as the
phage genomes are usually small and these small
recognition sites occur more frequently in the
phage.
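
The relationship between site length and site frequency noted above can be illustrated with a rough calculation. The following is a minimal sketch, assuming a random sequence with equal base frequencies (real genomes deviate from this); the function name and the example genome length are illustrative assumptions and are not taken from the references cited.

```python
def expected_sites(genome_length_bp: int, site_length_bp: int) -> float:
    """Expected occurrences of one specific site on one strand.

    Assumes a random sequence in which A, C, G and T are equally likely,
    so any given site of length n occurs with probability 1 / 4**n at
    each possible start position.
    """
    positions = genome_length_bp - site_length_bp + 1
    if positions <= 0:
        return 0.0
    return positions / 4 ** site_length_bp


if __name__ == "__main__":
    # Hypothetical 50 kb genome, used only to show the trend.
    for n in (4, 5, 6, 8):
        print(n, round(expected_sites(50_000, n), 2))
```

Under this assumption, each additional base pair in the recognition site lowers the expected number of sites about fourfold, which is why 4-6 base pair sites are found even in small genomes.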

Eighty different R-M systems belonging to
the Type IIS class with over 35 specificities have
been identified. This class is unique in that the
cleavage site of the enzyme is separate from the
recognition sequence. Usually the distance between
the recognition site and the cleavage site is quite
precise (Szybalski et al., Gene, 100:13-26, 1991).
Among all these enzymes, the FokI restriction
endonuclease is the best characterized member
of the Type IIS class. The FokI endonuclease
(RFokI) recognizes asymmetric pentanucleotides in
double-stranded DNA, 5'-GGATG-3' (SEQ ID NO: 1) in
one strand and 3'-CCTAC-5' (SEQ ID NO: 2) in the
other, and introduces staggered cleavages at sites
away from the recognition site (Sugisaki et al.,
Gene 16:73-78, 1981). In contrast, the FokI
methylase (MFokI) modifies DNA thereby rendering the
DNA resistant to digestion by FokI endonuclease.
The FokI restriction and modification genes have
been cloned and their nucleotide sequences deduced
(Kita et al., J. Biol. Chem. 264:5751-5756,
1989). Nevertheless, the domain structure of the
FokI restriction endonuclease remains unknown,
although a three domain structure has been suggested
(Wilson and Murray, Annu. Rev. Genet. 25:585-627,
1991).

SUMMARY OF THE INVENTION 

Accordingly, it is an object of the
present invention to provide isolated domains of
Type IIS restriction endonuclease.

It is another object of the present
invention to provide hybrid restriction enzymes
which are useful for mapping and sequencing.

An additional object of the present
invention is to provide two insertion mutants of
FokI which have an increased distance of cleavage
from the recognition site as compared to the
wild-type enzyme. The polymerase chain reaction (PCR) is
utilized to construct the two mutants.

Various other objects and advantages of
the present invention will become obvious from the
drawings and the following description of the
invention.

In one embodiment, the present invention
relates to a DNA segment encoding the recognition
domain of a Type IIS endonuclease which contains the
sequence-specific recognition activity of the Type
IIS endonuclease or a DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of the Type IIS
endonuclease.

In another embodiment, the present
invention relates to an isolated protein consisting
essentially of the N-terminus or recognition domain
of the FokI restriction endonuclease which protein
has the sequence-specific recognition activity of
the endonuclease or an isolated protein consisting
essentially of the C-terminus or catalytic domain of
the FokI restriction endonuclease which protein has
the nuclease activity of the endonuclease.

In a further embodiment, the present
invention relates to a DNA construct comprising a
first DNA segment encoding the catalytic domain of a
Type IIS endonuclease which contains the cleavage
activity of the Type IIS endonuclease; a second DNA
segment encoding a sequence-specific recognition
domain other than the recognition domain of the Type
IIS endonuclease; and a vector. In the construct,
the first DNA segment and the second DNA segment are
operably linked to the vector to result in the
production of a hybrid restriction enzyme.

In another embodiment, the present
invention relates to a hybrid restriction enzyme
comprising the catalytic domain of a Type IIS
endonuclease which contains the cleavage activity of
the Type IIS endonuclease linked to a recognition
domain of an enzyme or a protein other than the Type
IIS endonuclease from which the cleavage domain is
obtained.

In a further embodiment, the present
invention relates to a DNA construct comprising a
first DNA segment encoding the catalytic domain of a
Type IIS endonuclease which contains the cleavage
activity of the Type IIS endonuclease; a second DNA
segment encoding a sequence-specific recognition
domain other than the recognition domain of the Type
IIS endonuclease; a third DNA segment comprising one
or more codons, wherein the third DNA segment is
inserted between the first DNA segment and the
second DNA segment; and a vector. Preferably, the
third segment contains four or seven codons.

In another embodiment, the present
invention relates to a procaryotic cell comprising a
first DNA segment encoding the catalytic domain of a
Type IIS endonuclease which contains the cleavage
activity of the Type IIS endonuclease; a second DNA
segment encoding a sequence-specific recognition
domain other than the recognition domain of the Type
IIS endonuclease; a third DNA segment comprising one
or more codons, wherein the third DNA segment is
inserted between the first DNA segment and the
second DNA segment; and a vector. The first DNA
segment and the second DNA segment are operably
linked to the vector so that a single protein is
produced.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGURE 1 shows sequences of the 5' and 3'
primers used to introduce new translation signals
into fokIM and fokIR genes during PCR amplification
(SEQ ID NOs: 3-9). SD represents Shine-Dalgarno
consensus RBS for Escherichia coli (E. coli) and a
7-bp spacer separates the RBS from the ATG start
codon. The fokIM primers are flanked by NcoI
sites. The fokIR primers are flanked by BamHI
sites. Start and stop codons are shown in bold
letters. The 18-bp complement sequence is
complementary to the sequence immediately following
the stop codon of MfokI gene.

FIGURE 2 shows the structure of plasmids
pACYCfokIM, pRRSfokIR and pCBfokIR. The
PCR-modified fokIM gene was inserted at the NcoI site of
pACYC184 to form pACYCfokIM. The PCR-generated
fokIR gene was inserted at the BamHI sites of pRRS
and pCB to form pRRSfokIR and pCBfokIR,
respectively. pRRS possesses a lac UV5 promoter and
pCB contains a strong tac promoter. In addition,
these vectors contain the positive retroregulator
sequence downstream of the inserted fokIR gene.

FIGURE 3 shows SDS (0.1%) - polyacrylamide
(12%) gel electrophoretic profiles at each step in
the purification of FokI endonuclease. Lanes: 1,
protein standards; 2, crude extract from uninduced
cells; 3, crude extract from cells induced with 1 mM
IPTG; 4, phosphocellulose pool; 5, 50-70% (NH4)2SO4
fractionation pool; and 6, DEAE pool.

FIGURE 4 shows SDS (0.1%) - polyacrylamide
(12%) gel electrophoretic profiles of tryptic
fragments at various time points of trypsin
digestion of FokI endonuclease in presence of the
oligonucleotide DNA substrate,
d-5'-CCTCTGGATGCTCTC-3' (SEQ ID NO: 10):
5'-GAGAGCATCCAGAGG-3' (SEQ ID NO: 11).
Lanes: 1, protein standards; 2, FokI
endonuclease; 3, 2.5 min; 4, 5 min; 5, 10 min; 6, 20
min; 7, 40 min; 8, 80 min; 9, 160 min of trypsin
digestion respectively. Lanes 10-13: HPLC purified
tryptic fragments. Lanes: 10, 41 kDa fragment; 11,
30 kDa fragment; 12, 11 kDa fragment; and 13, 25 kDa
fragment.

FIGURE 5 shows the identification of DNA
binding tryptic fragments of FokI endonuclease using
an oligo dT-cellulose column. Lanes: 1, protein
standards; 2, FokI endonuclease; 3, 10 min trypsin
digestion mixture of FokI - oligo complex; 4,
tryptic fragments that bound to the oligo
dT-cellulose column; 5, 160 min trypsin digestion
mixture of FokI - oligo complex; 6, tryptic
fragments that bound to the oligo dT-cellulose
column.

FIGURE 6 shows an analysis of the cleavage
properties of the tryptic fragments of FokI
endonuclease.

(A) The cleavage properties of the
tryptic fragments were analyzed by agarose gel
electrophoresis. 1 µg of pTZ19R in 10 mM Tris.HCl
(pH 8), 50 mM NaCl, 1 mM DTT, and 10 mM MgCl2 was
digested with 2 µl of the solution containing the
fragments (tryptic digests, breakthrough and eluate
respectively) at 37°C for 1 hr in a reaction volume
of 10 µl. Lanes 4 to 6 correspond to trypsin
digestion of FokI - oligo complex in absence of
MgCl2. Lanes 7 to 9 correspond to trypsin digestion
of FokI - oligo complex in presence of 10 mM MgCl2.
Lanes: 1, 1 kb ladder; 2, pTZ19R; 3, pTZ19R
digested with FokI endonuclease; 4 and 7, reaction
mixture of the tryptic digests of FokI - oligo
complex; 5 and 8, 25 kDa C-terminal fragment in the
breakthrough volume; 6 and 9, tryptic fragments of
FokI that bound to the DEAE column. The intense
bands at bottom of the gel correspond to excess
oligonucleotides.

(B) SDS (0.1%) - polyacrylamide (12%) gel
electrophoretic profiles of fragments from the DEAE
column. Lanes 3 to 5 correspond to trypsin
digestion of FokI - oligo complex in absence of
MgCl2. Lanes 6 to 8 correspond to trypsin digestion
of FokI - oligo complex in presence of 10 mM MgCl2.
Lanes: 1, protein standards; 2, FokI endonuclease;
3 and 6, reaction mixture of the tryptic digests of
FokI - oligo complex; 4 and 7, 25 kDa C-terminal
fragment in the breakthrough volume; 5 and 8,
tryptic fragments of FokI that bound to the DEAE
column.

FIGURE 7 shows an analysis of
sequence-specific binding of DNA by 41 kDa N-terminal
fragment using gel mobility shift assays. For the
exchange reaction, the complex (10 µl) was incubated
with 1 µl of 32P-labeled specific (or non-specific)
oligonucleotide duplex in a volume of 20 µl
containing 10 mM Tris.HCl, 50 mM NaCl and 10 mM MgCl2
at 37°C for various times. 1 µl of the
5'-32P-labeled specific probe [d-5'-CCTCTGGATGCTCTC-3' (SEQ
ID NO: 10): 5'-GAGAGCATCCAGAGG-3' (SEQ ID NO: 11)]
contained 12 picomoles of the duplex and ~50 x 10^3
cpm. 1 µl of the 5'-32P-labeled non-specific probe
[5'-TAATTGATTCTTAA-3' (SEQ ID NO: 12):
5'-ATTAAGAATCAATT-3' (SEQ ID NO: 13)] contained 12
picomoles of the duplex and ~25 x 10^3 cpm. (A)
Lanes: 1, specific oligonucleotide duplex; 2, 41
kDa N-terminal fragment-oligo complex; 3 and 4,
specific probe incubated with the complex for 30 and
120 min respectively. (B) Lanes: 1, non-specific
oligonucleotide duplex; 2, 41 kDa N-terminal
fragment-oligo complex; 3 and 4, non-specific probe
incubated with the complex for 30 and 120 min
respectively.

FIGURE 8 shows SDS (0.1%) - polyacrylamide
(12%) gel electrophoretic profiles of tryptic
fragments at various time points of trypsin
digestion of FokI endonuclease. The enzyme (200 µg)
in a final volume of 200 µl containing 10 mM
Tris.HCl, 50 mM NaCl and 10 mM MgCl2 was digested with
trypsin at RT. The trypsin to FokI ratio was 1:50
by weight. Aliquots (28 µl) from the reaction
mixture were removed at different time intervals and
quenched with excess antipain. Lanes: 1, protein
standards; 2, FokI endonuclease; 3, 2.5 min; 4, 5.0
min; 5, 10 min; 6, 20 min; 7, 40 min; 8, 80 min; and
9, 160 min of trypsin digestion respectively.

FIGURE 9 shows the tryptic map of FokI
endonuclease. (A) FokI endonuclease fragmentation
pattern in absence of the oligonucleotide substrate.
(B) FokI endonuclease fragmentation pattern in
presence of the oligonucleotide substrate.

FIGURE 10 shows the predicted secondary
structure of FokI based on its primary sequence
using the PREDICT program (see SEQ ID NO: 31). The
trypsin cleavage site of FokI in the presence of DNA
substrates is indicated by the arrow. The
KSELEEKKSEL segment is highlighted. The symbols are
as follows: h, helix; s, sheet; and #, random coil.

FIGURE 11 shows the sequences of the 5'
and 3' oligonucleotide primers used to construct the
insertion mutants of FokI (see SEQ ID NO: 32, SEQ ID
NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ
ID NO: 37, SEQ ID NO: 38 and SEQ ID NO: 39,
respectively). The four and seven codon inserts are
shown in bold letters. The amino acid sequence is
indicated over the nucleotide sequence. The same 3'
primer was used in the PCR amplification of both
insertion mutants.

FIGURE 12 shows the SDS/PAGE profiles of
the mutant enzymes purified to homogeneity. Lanes:
1, protein standards; 2, FokI; 3, mutant FokI with
4-codon insertion; and 4, mutant FokI with 7-codon
insertion.

FIGURE 13 shows an analysis of the DNA
sequence specificity of the mutant enzymes. The DNA
substrates were digested in 10 mM Tris HCl, pH
8.0/50 mM NaCl/1 mM DTT/10 mM MgCl2 at 37°C for 2
hrs.

(A) Cleavage pattern of pTZ19R DNA
substrate analyzed by 1% agarose gel
electrophoresis. 2 µg of pTZ19R DNA was used in each
reaction. Lanes: 1, 1-kilobase (kb) ladder; 2,
pTZ19R; 3, pTZ19R digested with FokI; 4, pTZ19R
digested with mutant FokI with 4-codon insertion;
and 5, pTZ19R digested with mutant FokI with 7-codon
insertion.

(B) Cleavage pattern of 256 bp DNA
substrate containing a single FokI site analyzed by
1.5% agarose gel electrophoresis. 1 µg of
radiolabeled substrates (32P-labeled on individual
strands) was digested as described above. The
agarose gel was stained with ethidium bromide and
visualized under UV light. Lanes 2 to 6 correspond
to the 32P-labeled substrate in which the 5'-CATCC-3'
strand is 32P-labeled. Lanes 7 to 11 correspond to
the substrate in which the 5'-GGATG-3' strand is
32P-labeled. Lanes: 1, 1 kb ladder; 2 and 7, 32P-labeled
250 bp DNA substrates; 3 and 8, 32P-labeled
substrates cleaved with FokI; 4 and 9, wild-type
FokI purified in the laboratory; 5 and 10, mutant FokI
with 4-codon insertion; 6 and 11, mutant FokI with
7-codon insertion.

(C) Autoradiograph of the agarose gel
from above. Lanes: 2 to 11, same as in B.
FIGURE 14 shows an analysis of the
distance of cleavage from the recognition site by
FokI and the mutant enzymes. The unphosphorylated
oligonucleotides were used for dideoxy DNA
sequencing with pTZ19R as the template. The
sequencing products (G, A, T, C) were
electrophoresed on a 6% acrylamide gel containing 7M
urea, and the gel dried. The products were then
exposed to an x-ray film for 2 hrs. Cleavage
products from the 100 bp and the 256 bp DNA
substrates are shown in A and B, respectively. I
corresponds to substrates containing 32P-label on the
5'-GGATG-3' strand, and II corresponds to substrates
containing 32P-label on the 5'-CATCC-3' strand.
Lanes: 1, FokI; 2, FokI; 3, mutant FokI with
4-codon insertion; and 4, mutant FokI with 7-codon
insertion.

FIGURE 15 shows a map of the cleavage
site(s) of FokI and the mutant enzymes based on the
100 bp DNA substrate containing a single FokI site:
(A) wild-type FokI; (B) mutant FokI with 4-codon
insertion; and (C) mutant FokI with 7-codon
insertion (see SEQ ID NO: 40). The sites of cleavage
are indicated by the arrows. Major cleavage sites
are shown by larger arrows.

DETAILED DESCRIPTION OF THE INVENTION 
The present invention is based on the
identification and characterization of the
functional domains of the FokI restriction
endonuclease. In the experiments resulting in the
present invention, it was discovered that the FokI
restriction endonuclease is a two domain system, one
domain of which possesses the sequence-specific
recognition activity while the other domain contains
the nuclease cleavage activity.

The FokI restriction endonuclease
recognizes the non-palindromic pentanucleotide
5'-GGATG-3' (SEQ ID NO: 1): 5'-CATCC-3' (SEQ ID NO: 2) in
duplex DNA and cleaves 9/13 nucleotides downstream
from the recognition site. Since 10 base pairs are
required for one turn of the DNA helix, the present
inventors hypothesized that the enzyme would
interact with one face of the DNA by binding at one
point and cleave at another point on the next turn
of the helix. This suggested the presence of two
separate protein domains, one for sequence-specific
recognition of DNA and one for endonuclease
activity. The hypothesized two domain structure was
shown to be the correct structure of the FokI
endonuclease system by studies that resulted in the
present invention.
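
The 9/13 cleavage geometry described above can be sketched in a few lines of code. The following is a minimal illustration, assuming 0-based coordinates along the top strand and defining a cut position as the number of bases lying to its left; the function name, the coordinate convention and the test sequences are assumptions made for illustration and are not part of the disclosure.

```python
from typing import List, Tuple

SITE = "GGATG"     # recognition sequence read 5'->3'
SITE_RC = "CATCC"  # the same site read on the complementary strand


def find_cuts(seq: str, near: int = 9, far: int = 13) -> List[Tuple[str, int, int, int]]:
    """Locate recognition sites and report (orientation, site_start, top_cut, bottom_cut).

    For a forward site the top strand is cut `near` nucleotides past the
    site and the bottom strand `far` nucleotides past it. For a reverse
    site the geometry is mirrored. Sites whose cuts would fall outside
    the sequence are skipped.
    """
    seq = seq.upper()
    n = len(SITE)
    hits = []
    for i in range(len(seq) - n + 1):
        window = seq[i:i + n]
        if window == SITE:
            orientation, top, bottom = "forward", i + n + near, i + n + far
        elif window == SITE_RC:
            orientation, top, bottom = "reverse", i - far, i - near
        else:
            continue
        if 0 <= top <= len(seq) and 0 <= bottom <= len(seq):
            hits.append((orientation, i, top, bottom))
    return hits


if __name__ == "__main__":
    toy = "AA" + "GGATG" + "T" * 20  # hypothetical substrate
    print(find_cuts(toy))            # 9/13 geometry
    print(find_cuts(toy, 10, 14))    # a 10/14 geometry for comparison
```

The two cuts are always four nucleotides apart, giving a 4-nucleotide 5' overhang; changing the `near` and `far` arguments shifts both cuts together without changing the site that is recognized.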

Accordingly, in one embodiment, the
present invention relates to a DNA segment which
encodes the N-terminus of the FokI restriction
endonuclease (preferably, about the N-terminal 2/3's
of the protein). This DNA segment encodes a protein
which has the sequence-specific recognition activity
of the endonuclease, that is, the encoded protein
recognizes the non-palindromic pentanucleotide
d-5'-GGATG-3' (SEQ ID NO: 1): 5'-CATCC-3' (SEQ ID NO: 2) in
duplex DNA. Preferably, the DNA segment of the
present invention encodes amino acids 1-382 of the
FokI endonuclease.

In a further embodiment, the present
invention relates to a DNA segment which encodes the
C-terminus of the FokI restriction endonuclease.
The protein encoded by this DNA segment of the
present invention has the nuclease cleavage activity
of the FokI restriction endonuclease. Preferably,
the DNA segment of the present invention encodes
amino acids 383-578 of the FokI endonuclease. DNA
segments of the present invention can be readily
isolated from a biological sample using methods
known in the art, for example, gel electrophoresis,
affinity chromatography, polymerase chain reaction
(PCR) or a combination thereof. Further, the DNA
segments of the present invention can be chemically
synthesized using standard methods in the art.

The present invention also relates to the
proteins encoded by the DNA segments of the present
invention. Thus, in another embodiment, the present
invention relates to a protein consisting
essentially of the N-terminus of the FokI
endonuclease which retains the sequence-specific
recognition activity of the enzyme. This protein of
the present invention has a molecular weight of
about 41 kilodaltons as determined by SDS
polyacrylamide gel electrophoresis in the presence
of 2-mercaptoethanol.

In a further embodiment, the present
invention relates to a protein consisting
essentially of the C-terminus of the FokI
restriction endonuclease (preferably, the C-terminal
1/3 of the protein). The molecular weight of this
protein is about 25 kilodaltons as determined by SDS
polyacrylamide gel electrophoresis in the presence
of 2-mercaptoethanol.

The proteins of the present invention can
be isolated or purified from a biological sample
using methods known in the art. For example, the
proteins can be obtained by isolating and cleaving
the FokI restriction endonuclease. Alternatively,
the proteins of the present invention can be
chemically synthesized or produced using recombinant
DNA technology and purified.

The DNA segments of the present invention
can be used to generate 'hybrid' restriction enzymes
by linking other DNA binding protein domains with
the nuclease domain of FokI. This can be achieved
chemically as well as by recombinant DNA technology.
Such chimeric enzymes are useful for physical
mapping and sequencing of genomes of various
species, such as humans, mice and plants. For
example, such enzymes would be suitable for use in
mapping the human genome.

Such chimeric enzymes are also valuable
research tools in recombinant DNA technology and
molecular biology. Currently only 4-6 base pair
cutters and a few 8 base pair cutters are available
commercially. (There are about 10 endonucleases
which cut >6 base pairs that are available
commercially.) By linking other DNA binding
proteins to the nuclease domain of FokI, new enzymes
can be generated that recognize more than 6 base
pairs in DNA.

Accordingly, in a further embodiment, the
present invention relates to a DNA construct and
the hybrid restriction enzyme encoded therein. The
DNA construct of the present invention comprises a
first DNA segment encoding the nuclease domain of
the FokI restriction endonuclease, a second DNA
segment encoding a sequence-specific recognition
domain and a vector. The first DNA segment and the
second DNA segment are operably linked to the vector
so that expression of the segments can be effected
thereby yielding a chimeric restriction enzyme. The
construct can comprise regulatory elements such as
promoters (for example, T7, tac, trp and lac UV5
promoters), transcriptional terminators or
retroregulators (for example, stem loops). Host
cells (procaryotes such as E. coli) can be
transformed with the DNA constructs of the present
invention and used for the production of chimeric
restriction enzymes.

The hybrid enzymes of the present
invention comprise the nuclease domain of FokI
linked to a recognition domain of another enzyme or
DNA binding protein (such as naturally occurring
DNA binding proteins that recognize 6 base pairs).
Suitable recognition domains include, but are not
limited to, the recognition domains of zinc finger
motifs; homeo domain motifs; other DNA binding
protein domains of lambda repressor, lac repressor,
cro, gal4; DNA binding protein domains of oncogenes
such as myc, jun; and other naturally occurring
sequence-specific DNA binding proteins that
recognize >6 base pairs.

The hybrid restriction enzymes of the
present invention can be produced by those skilled
in the art using known methodology. For example,
the enzymes can be chemically synthesized or
produced using recombinant DNA technology well known
in the art. The hybrid enzymes of the present
invention can be produced by culturing host cells
(such as HB101, SRI, RB791 and MM294) containing
the DNA construct of the present invention and
isolating the protein. Further, the hybrid enzymes
can be chemically synthesized, for example, by
linking the nuclease domain of the FokI to the
recognition domain using common linkage methods
known in the art, for example, using protein
cross-linking agents such as EDC/NHS, DSP, etc.

While the FokI restriction endonuclease
was the enzyme studied in the following experiments,
it is expected that other Type IIS endonucleases
(such as those listed in Table 2) will function
using a similar two domain structure, which one
skilled in the art could readily determine based on
the present invention.

Recently, StsI, a heteroschizomer of FokI,
has been isolated from Streptococcus sanguis (Kita
et al., Nucleic Acids Research 20(3): 618, 1992).
StsI recognizes the same nonpalindromic
pentadeoxyribonucleotide 5'-GGATG-3': 5'-CATCC-3' as
FokI but cleaves 10/14 nucleotides downstream of the
recognition site. The StsI RM system has been
cloned and sequenced (Kita et al., Nucleic Acids
Research 20(16): 4167-72, 1992). Considerable amino
acid sequence homology (~30%) has been detected
between the endonucleases, FokI and StsI.

Another embodiment of the invention
relates to the construction of two insertion mutants
of FokI endonuclease using the polymerase chain
reaction (PCR). In particular, this embodiment
includes a DNA construct comprising a first DNA
segment encoding the catalytic domain of a Type IIS
endonuclease which contains the cleavage activity of
the Type IIS endonuclease, a second DNA segment
encoding a sequence-specific recognition domain
other than the recognition domain of the Type IIS
endonuclease, and a third DNA segment comprising one
or more codons. The third DNA segment is inserted
between the first DNA segment and the second DNA
segment. The construct also includes a vector.
The Type IIS endonuclease is FokI restriction
endonuclease.

Suitable recognition domains include, but
are not limited to, zinc finger motifs, homeo domain
motifs, DNA binding domains of repressors, DNA
binding domains of oncogenes and naturally occurring
sequence-specific DNA binding proteins that
recognize >6 base pairs.

As noted above, the recognition domain of
FokI restriction endonuclease is at the amino
terminus of FokI endonuclease, whereas the cleavage
domain is probably at the carboxyl terminal third of
the molecule. It is likely that the domains are
connected by a linker region, which defines the
spacing between the recognition and the cleavage
sites of the DNA substrate. This linker region of
FokI is susceptible to cleavage by trypsin in the
presence of a DNA substrate yielding a 41-kDa
amino-terminal fragment (the DNA binding domain) and a
25-kDa carboxyl-terminal fragment (the cleavage
domain). Secondary structure prediction of FokI
endonuclease based on its primary amino acid
sequence supports this hypothesis (see Figure 10).
The predicted structure reveals a long stretch of
alpha helix region at the junction of the
recognition and cleavage domains. This helix
probably constitutes the linker which connects the
two domains of the enzyme. Thus, it was thought
that the cleavage distance of FokI from the
recognition site could be altered by changing the
length of this spacer (the alpha helix). Since 3.6
amino acids are required to form one turn of the
alpha helix, insertion of either four codons or
seven codons in this region would extend the
pre-existing helix in the native enzyme by one or two
turns, respectively. Close examination of the amino
acid sequence of this helix region revealed the
presence of two KSEL repeats separated by amino
acids EEK (Figure 10) (see SEQ ID NO: 21). The
segments KSEL (4 codons) (see SEQ ID NO: 22) and
KSELEEK (7 codons) (see SEQ ID NO: 23) appeared to be
good choices for insertion within this helix in
order to extend it by one and two turns,
respectively. (See Examples X and XI.) Thus,
genetic engineering was utilized in order to create
mutant enzymes.
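
The helix arithmetic behind the choice of four and seven codons can be checked with a short calculation. The following is a minimal sketch; the 3.6 residues per turn figure is stated above, whereas the rise of 1.5 angstroms per residue is a textbook value for an ideal alpha helix assumed here for illustration, and the function names are hypothetical.

```python
RESIDUES_PER_TURN = 3.6          # stated in the description above
RISE_PER_RESIDUE_ANGSTROM = 1.5  # assumed textbook value for an ideal alpha helix


def helix_turns(inserted_residues: int) -> float:
    """Number of additional helical turns contributed by the insert."""
    return inserted_residues / RESIDUES_PER_TURN


def helix_extension(inserted_residues: int) -> float:
    """Approximate added helix length in angstroms along the helix axis."""
    return inserted_residues * RISE_PER_RESIDUE_ANGSTROM


if __name__ == "__main__":
    for name in ("KSEL", "KSELEEK"):
        n = len(name)
        print(name, n, round(helix_turns(n), 2), helix_extension(n))
```

On these assumptions, the four-residue insert adds roughly 1.1 turns and the seven-residue insert roughly 1.9 turns, consistent with the "one or two turns" stated above. The estimate treats the linker as an ideal, rigid helix, which the description does not establish.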

In particular, the mutants are obtained by
inserting one or more, and preferably four or seven,
codons between the recognition and cleavage domains
of FokI. More specifically, the four or seven
codons are inserted at nucleotide 1152 of the gene
encoding the endonuclease. The mutants have the
same DNA sequence specificity as the wild-type
enzyme. However, they cleave one nucleotide further
away from the recognition site on both strands of
the DNA substrates as compared to the wild-type
enzyme.
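
The one-nucleotide shift described above can be illustrated with a small Python sketch. The recognition sequence GGATG and the wild-type spacing of 9 nucleotides (top strand) and 13 nucleotides (bottom strand) are the values commonly listed for FokI in enzyme catalogues; they are assumptions here, since this passage does not state them. The substrate is hypothetical.

```python
SITE = "GGATG"       # assumed FokI recognition sequence (top strand)
TOP_OFFSET = 9       # assumed wild-type spacing, top strand
BOTTOM_OFFSET = 13   # assumed wild-type spacing, bottom strand

def cut_positions(top_strand, extra=0):
    """Return (top_cut, bottom_cut), counted in nucleotides from the 5' end of
    the top strand, or None if the site is absent or the DNA is too short.
    `extra` moves both cuts away from the site (1 for the insertion mutants)."""
    start = top_strand.find(SITE)
    if start < 0:
        return None
    site_end = start + len(SITE)
    top_cut = site_end + TOP_OFFSET + extra
    bottom_cut = site_end + BOTTOM_OFFSET + extra
    if bottom_cut > len(top_strand):
        return None
    return top_cut, bottom_cut

substrate = "AA" + SITE + "C" * 20   # hypothetical substrate, not from the patent
print("wild type:", cut_positions(substrate))
print("mutant:   ", cut_positions(substrate, extra=1))
```

The `extra` parameter is only a convenient way to model the reported shift; it says nothing about the structural mechanism.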

Analysis of the cut sites of FokI and the
mutants, based on the cleavage of the 100 bp
fragment, is summarized in Figure 15. Insertion of
four (or seven) codons between the recognition and
cleavage domains of FokI is accompanied by an
increase in the distance of cleavage from the
recognition site. This information further supports
the presence of two separate protein domains within
the FokI endonuclease: one for the sequence-
specific recognition and the other for the
endonuclease activity. The two domains are
connected by a linker region which defines the
spacing between the recognition and the cleavage
sites of the DNA substrate. The modular structure
of the enzyme suggests it may be feasible to
construct chimeric endonucleases of different
sequence specificity by linking other DNA-binding
proteins to the cleavage domain of the FokI
endonuclease.

In view of the above information, another
embodiment of the invention includes a procaryotic
cell comprising a first DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of the Type IIS
endonuclease, a second DNA segment encoding a
sequence-specific recognition domain other than the
recognition domain of the Type IIS endonuclease, and
a third DNA segment comprising one or more codons.
The third DNA segment is inserted between the first
DNA segment and the second DNA segment. The cell
also includes a vector. Additionally, it should be
noted that the first DNA segment and the second DNA
segment are operably linked to the vector so that a
single protein is produced. The third segment may
consist essentially of four or seven codons.

The present invention also includes the
protein produced by the procaryotic cell referred to
directly above. In particular, the isolated protein
consists essentially of the recognition domain of
the FokI restriction endonuclease, the catalytic
domain of the FokI restriction endonuclease, and
amino acids encoded by the codons present in the
third DNA segment.

The following non-limiting Examples are
provided to describe the present invention in
greater detail.

The following materials and methods were
utilized in the isolation and characterization of
the FokI restriction endonuclease functional domains
as exemplified hereinbelow.

Bacterial strains and plasmids
Recombinant plasmids were transformed into
E. coli RB791 cells, which carry the lacIq allele
on the chromosome (Brent and Ptashne, PNAS USA,
78:4204-4208, 1981), or E. coli RR1 cells. Plasmid
pACYCfokIM is a derivative of pACYC184 carrying the
PCR-generated fokIM gene inserted into the NcoI site.
The plasmid expresses the FokI methylase
constitutively and was present in RB791 cells (or
RR1 cells) whenever the fokIR gene was introduced on
a separate compatible plasmid. The FokI methylase
modifies FokI sites and provides protection against
chromosomal cleavage. The construction of vectors
pRRS and pCB is described elsewhere (Skoglund et
al., Gene, 88:1-5, 1990).

Enzymes, biochemicals and oligos
Oligo primers for PCR were synthesized
with an Applied Biosystems DNA synthesizer using
cyanoethyl phosphoramidite chemistry and purified by
reversed-phase HPLC. Restriction enzymes were
purchased from New England Biolabs. The DNA ligase
and IPTG were from Boehringer-Mannheim. PCR reagents
were purchased as a Gene Amp Kit from Perkin-Elmer.
The plasmid purification kit was from QIAGEN.

Restriction enzyme assays
Cells from a 5-ml sample of culture medium
were harvested by centrifugation, resuspended in 0.5
ml sonication buffer [50 mM Tris.HCl (pH 8), 14 mM 2-
mercaptoethanol], and disrupted by sonication (3 x 5
seconds each) on ice. The cellular debris was
centrifuged and the crude extract used in the enzyme
assay. Reaction mixtures (10 µl) contained 10 mM
Tris.HCl (pH 8), 10 mM MgCl2, 7 mM 2-mercaptoethanol,
50 µg of BSA, 1 µg of plasmid pTZ19R (U.S.
Biochemicals) and 1 µl of crude enzyme. Incubation
was at 37°C for 15 min. tRNA (10 µg) was added to
the reaction mixtures when necessary to inhibit non-
specific nucleases. After digestion,
1 µl of dye solution (100 mM EDTA, 0.1% bromophenol
blue, 0.1% xylene cyanol, 50% glycerol) was added,
and the samples were electrophoresed on a 1% agarose
gel. Bands were stained with 0.5 µg ethidium
bromide/ml and visualized with 310-nm ultraviolet
light.

SDS/PAGE

Proteins were prepared in sample buffer
and electrophoresed in SDS (0.1%)-polyacrylamide
(12%) gels as described by Laemmli (Laemmli, Nature,
227:680-685, 1970). Proteins were stained with
coomassie blue.

Example I

Cloning of FokI RM system

The FokI system was cloned by selecting
for the modification phenotype. Flavobacterium
okeanokoites strain DNA was isolated by the method
described by Caserta et al. (Caserta et al., J.
Biol. Chem., 262:4770-4777, 1987). Several
Flavobacterium okeanokoites genome libraries were
constructed in plasmids pBR322 and pUC13 using the
cloning enzymes PstI, BamHI and BglII. Plasmid
library DNA (10 µg) was digested with 100 units of
FokI endonuclease to select for plasmids expressing
the fokIM+ phenotype.

Surviving plasmids were transformed into
RR1 cells and transformants were selected on plates
containing the appropriate antibiotic. After two rounds
of biochemical enrichment, several plasmids
expressing the fokIM+ phenotype from these libraries
were identified. Plasmids from these clones were
totally resistant to digestion by FokI.

Among eight transformants that were
analyzed from the F. okeanokoites pBR322 PstI
library, two appeared to carry the fokIM gene and
plasmids from these contained a 5.5 kb PstI
fragment. Among eight transformants that were
picked from the F. okeanokoites pBR322 BamHI library,
two appeared to carry the fokIM gene and their
plasmids contained a ~18 kb BamHI fragment. Among
eight transformants that were analyzed from the F.
okeanokoites genome BglII library in pUC13, six
appeared to carry the fokIM gene. Three of these
clones had an 8 kb BglII insert while the rest
contained a 16 kb BglII fragment.

Plating efficiency of phage λ on these
clones suggested that they also carried the fokIR
gene. The clones with the 8-kb BglII insert
appeared to be most resistant to phage infection.
Furthermore, FokI endonuclease activity was
detected in the crude extract of this clone after
partial purification on a phosphocellulose column.
The plasmid, pUCfokIRM, from this clone was chosen
for further characterization.

The 5.5 kb PstI fragment was transferred
to M13 phages and the nucleotide sequences of parts
of this insert were determined using Sanger's sequencing
method (Sanger et al., PNAS USA, 74:5463-5467,
1977). The complete nucleotide sequence of the FokI
RM system has been published by other laboratories
(Looney et al., Gene, 80:193-208, 1989; Kita et al.,
Nucleic Acids Res., 17:8741-8753, 1989; Kita et al.,
J. Biol. Chem., 264:5751-5756, 1989).

Example II

Construction of an efficient overproducer clone of
FokI endonuclease using polymerase chain reaction.
The PCR technique was used to alter
transcriptional and translational signals
surrounding the fokIR gene so as to achieve
overexpression in E. coli (Skoglund et al., Gene,
88:1-5, 1990). The ribosome-binding sites preceding
the fokIR and fokIM genes were altered to match the
consensus E. coli signal.

In the PCR reaction, plasmid pUCfokIRM DNA
linearized with BamHI was used as the template. PCR
reactions (100 µl) contained 0.25 nmol of each
primer, 50 µM of each dNTP, 10 mM Tris.HCl (pH 8.3
at 25°C), 50 mM KCl, 1.5 mM MgCl2, 0.01% (w/v)
gelatin, 1 ng of template DNA, and 5 units of Taq DNA
polymerase. The oligo primers used for the
amplification of the fokIR and fokIM genes are shown
in Figure 1. Reaction mixtures (run in
quadruplicate) were overlaid with mineral oil and
reactions were carried out using a Perkin-Elmer-Cetus
Thermal Cycler.
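
The reaction composition above mixes amounts (nmol) and concentrations (µM). The Python sketch below converts between the two for the stated 100 µl reaction volume. The helper names are illustrative only and do not come from the cited protocol.

```python
def micromolar(amount_nmol, volume_ul):
    """Concentration in µM of amount_nmol dissolved in volume_ul microlitres."""
    return amount_nmol / volume_ul * 1000.0

def nmol_in(conc_um, volume_ul):
    """Amount in nmol of a solute at conc_um (µM) in volume_ul microlitres."""
    return conc_um * volume_ul / 1000.0

REACTION_UL = 100   # reaction volume stated in the text

print("primer concentration (µM):", micromolar(0.25, REACTION_UL))
print("each dNTP per reaction (nmol):", nmol_in(50, REACTION_UL))
```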

Initial template denaturation was
programmed for 2 min. Thereafter, the cycle profile
was programmed as follows: 2 min at 37°C
(annealing), 5 min at 72°C (extension), and 1 min at
94°C (denaturation). This profile was repeated for
25 cycles and the final 72°C extension was increased
to 10 min. The aqueous layers of the reaction
mixtures were pooled and extracted once with 1:1
phenol/chloroform and twice with chloroform. The
DNA was ethanol-precipitated and resuspended in 20
µl TE buffer [10 mM Tris.HCl (pH 7.5), 1 mM EDTA].
The DNA was then cleaved with appropriate
restriction enzymes to generate cohesive ends and
gel-purified.
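
As a rough planning aid, the programmed hold time of the cycling profile above can be totalled as follows. This Python sketch assumes that the 10 min final extension replaces the 5 min extension of the last cycle and ignores temperature ramp times, neither of which the text specifies.

```python
INITIAL_DENATURATION_MIN = 2
CYCLE_STEPS_MIN = {"annealing": 2, "extension": 5, "denaturation": 1}
CYCLES = 25
FINAL_EXTENSION_MIN = 10   # assumed to replace the extension step of the last cycle

def total_minutes():
    """Total programmed hold time in minutes, excluding ramp times."""
    per_cycle = sum(CYCLE_STEPS_MIN.values())
    extra_final = FINAL_EXTENSION_MIN - CYCLE_STEPS_MIN["extension"]
    return INITIAL_DENATURATION_MIN + CYCLES * per_cycle + extra_final

print(f"programmed time: {total_minutes()} min")
```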

The construction of an overproducer clone
was done in two steps. First, the PCR-generated DNA
containing the fokIM gene was digested with NcoI and
gel-purified. It was then ligated into NcoI-cleaved
and dephosphorylated pACYC184 and the recombinant
DNA transfected into E. coli RB791 iq or RR1 cells
made competent as described by Maniatis et al.
(Maniatis et al., Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory, Cold Spring
Harbor, NY, 1982). After Tc selection, several
clones were picked and plasmid DNA was examined by
restriction analysis for the presence of the fokIM gene
fragment in correct orientation to the
chloramphenicol promoter of the vector (see Figure
2). This plasmid expresses FokI methylase
constitutively and thereby protects the host from
chromosomal cleavage when the fokIR gene is
introduced into this host on a compatible plasmid.
The plasmid DNA from these clones is therefore
resistant to FokI digestion.

Second, the PCR-generated fokIR fragment
was ligated into BamHI-cleaved and dephosphorylated
high expression vectors pRRS or pCB. pRRS possesses
a lac UV5 promoter and pCB contains the strong tac
promoter. In addition, these vectors contain the
positive retroregulator stem-loop sequence derived
from the crystal protein-encoding gene of Bacillus
thuringiensis downstream of the inserted fokIR gene.
The recombinant DNA was transfected into competent
E. coli RB791 [pACYCfokIM] or
RR1 [pACYCfokIM] cells. After Tc and Ap antibiotic
selection, several clones were picked and plasmid
DNA was examined by restriction analysis for the fokIR
gene fragment in correct orientation for expression
from the vector promoters. These constructs were
then examined for enzyme production.

To produce the enzyme, plasmid-containing
RB791 or RR1 cells were grown at 37°C with
shaking in 2x concentrated TY medium [1.6% tryptone,
1% yeast extract, 0.5% NaCl (pH 7.2)] supplemented
with 20 µg Tc/ml (except for the pUCfokIRM plasmid)
and 50 µg Ap/ml. IPTG was added to a concentration
of 1 mM when the cell density reached O.D.600 = 0.8.
The cells were incubated overnight (12 hr) with
shaking. As is shown in Figure 2, both constructs
yield FokI to a level of 5-8% of the total cellular
protein.

Example III

Purification of FokI endonuclease

A simple three-step purification procedure
was used to obtain electrophoretically homogeneous
FokI endonuclease. RR1 [pACYCfokIM, pRRSfokIR] cells were
grown in 6 L of 2x TY containing 20 µg Tc/ml and 50
µg Ap/ml at 37°C to A600 = 0.8 and then induced
overnight with 1 mM IPTG. The cells were harvested
by centrifugation and then resuspended in 250 ml of
buffer A [10 mM Tris.phosphate (pH 8.0), 7 mM 2-
mercaptoethanol, 1 mM EDTA, 10% glycerol] containing
50 mM NaCl.

The cells were disrupted at maximum
intensity on a Branson Sonicator for 1 hr at 4°C.
The sonicated cells were centrifuged at 12,000 g for
2 hr at 4°C. The supernatant was then diluted to 1 L
with buffer A containing 50 mM NaCl. The
supernatant was loaded onto a 10 ml phosphocellulose
(Whatman) column pre-equilibrated with buffer A
containing 50 mM NaCl. The column was washed with
50 ml of loading buffer and the protein was eluted
with an 80-ml total gradient of 0.05 M to 0.5 M NaCl in
buffer A. The fractions were monitored by A280
absorption and analyzed by electrophoresis on SDS
(0.1%)-polyacrylamide (12%) gels (Laemmli, Nature,
227:680-685, 1970). Proteins were stained with
coomassie blue.
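
To relate a fraction's elution volume to its salt concentration, the gradient above can be modelled as follows. This Python sketch assumes the gradient is linear, which the text does not state explicitly, and ignores column dead volume.

```python
def nacl_molar(eluted_ml, start_m=0.05, end_m=0.5, total_ml=80.0):
    """NaCl concentration (M) after eluted_ml of an assumed linear gradient."""
    if not 0 <= eluted_ml <= total_ml:
        raise ValueError("elution volume lies outside the gradient")
    return start_m + (end_m - start_m) * eluted_ml / total_ml

for volume in (0, 20, 40, 80):
    print(f"{volume:>2} ml: {nacl_molar(volume):.3f} M NaCl")
```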

Restriction endonuclease activity of the
fractions was assayed using pTZ19R as substrate.
The fractions containing FokI were pooled and
fractionated with ammonium sulfate. The 50-70%
ammonium sulfate fraction contained the FokI
endonuclease. The precipitate was resuspended in 50
ml of buffer A containing 25 mM NaCl and loaded onto
a DEAE column. FokI does not bind to DEAE while
many contaminating proteins do. The flow-through
was concentrated on a phosphocellulose column.
Further purification was achieved using a gel
filtration (AcA 44) column. FokI was purified
to electrophoretic homogeneity using this procedure.

SDS (0.1%)-polyacrylamide (12%) gel
electrophoresis profiles of protein species present
at each stage of purification are shown in Figure 3.
The sequence of the first ten amino acids of the
purified enzyme was determined by protein
sequencing. The determined sequence was the same as
that predicted from the nucleotide sequence.
Crystals of this purified enzyme have also been
grown using PEG 4000 as the precipitant. FokI
endonuclease was purified further using an AcA 44 gel
filtration column.

Example IV

Analysis of FokIR endonuclease by trypsin cleavage
in the presence of DNA substrate.

Trypsin is a serine protease and it
cleaves at the C-terminal side of lysine and
arginine residues. This is a very useful enzyme to
study the domain structure of proteins and enzymes.
Trypsin digestion of FokI in the presence of its
substrate, d-5'-CCTCTGGATGCTCTC-3' (SEQ ID NO: 10):
5'-GAGAGCATCCAGAGG-3' (SEQ ID NO: 11), was carried
out with an oligonucleotide duplex to FokI molar
ratio of 2.5:1. FokI (200 µg) was incubated with
the oligonucleotide duplex in a volume of 180 µl
containing 10 mM Tris.HCl, 50 mM NaCl, 10% glycerol
and 10 mM MgCl2 at RT for 1 hr. Trypsin (20 µl, 0.2
mg/ml) was added to the mixture. Aliquots (28 µl)
from the reaction mixture were removed at different
time intervals and quenched with excess trypsin
inhibitor, antipain. The tryptic fragments were
purified by reversed-phase HPLC and their N-terminal
sequences determined using an automatic protein
sequenator from Applied Biosystems.
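
The amount of duplex implied by the 2.5:1 molar ratio can be estimated as follows. This Python sketch assumes a molecular mass of about 66 kDa for intact FokI, taken here as the sum of the 41 kDa and 25 kDa tryptic fragments; that figure is an estimate for illustration, not a value stated in this Example.

```python
def nmol_from_mass(mass_ug, mw_kda):
    """Micrograms divided by kDa gives nmol (1e-6 g / 1e3 g/mol = 1e-9 mol)."""
    return mass_ug / mw_kda

FOKI_MW_KDA = 66.0     # assumption: 41 kDa + 25 kDa fragments
MOLAR_RATIO = 2.5      # duplex : FokI, from the text

foki_nmol = nmol_from_mass(200, FOKI_MW_KDA)
duplex_nmol = MOLAR_RATIO * foki_nmol
print(f"FokI: {foki_nmol:.2f} nmol, duplex required: {duplex_nmol:.2f} nmol")
```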

The time course of trypsin digestion of
FokI endonuclease in the presence of a 2.5 molar
excess of oligonucleotide substrate and 10 mM MgCl2
is shown in Figure 4. At the 2.5 min time point
only two major fragments other than the intact FokI
were present, a 41 kDa fragment and a 25 kDa
fragment. Upon further trypsin digestion, the 41
kDa fragment degraded into a 30 kDa fragment and an 11
kDa fragment. The 25 kDa fragment appeared to be
resistant to any further trypsin digestion. This
fragment appeared to be less stable if the trypsin
digestion of the FokI-oligo complex was carried out in
the absence of MgCl2.

Only three major fragments (30 kDa, 25 kDa
and 11 kDa) were present at the 160 min time point.
Each of these fragments (41 kDa, 30 kDa, 25 kDa and
11 kDa) was purified by reversed-phase HPLC and
their N-terminal amino acid sequences were determined
(Table I). By comparing these N-terminal sequences
to the predicted sequence of FokI, the 41 kDa and 25
kDa fragments were identified as N-terminal and C-
terminal fragments, respectively. In addition, the
30 kDa fragment was N-terminal.

Example V

Isolation of DNA binding tryptic fragments of FokI
endonuclease using oligo dT-cellulose affinity
column.

The DNA binding properties of the tryptic
fragments were analyzed using an oligo dT-cellulose
column. FokI (160 µg) was incubated with a 2.5
molar excess of the oligonucleotide duplex [d-5'-
CCTCTGGATGCTCTC(A)15-3' (SEQ ID NO: 14):
5'-GAGAGCATCCAGAGG(A)15-3' (SEQ ID NO: 15)] in a
volume of 90 µl containing 10 mM Tris.HCl (pH 8), 50
mM NaCl, 10% glycerol and 10 mM MgCl2 at RT for 1 hr.
Trypsin (10 µl, 0.2 mg/ml) was added to the solution
to initiate digestion. The ratio of trypsin to FokI
(by weight) was 1:80. Digestion was carried out
for 10 min to obtain predominantly the 41 kDa N-terminal
fragment and the 25 kDa C-terminal fragment in the
reaction mixture. The reaction was quenched with a
large excess of antipain (10 µg) and diluted in
loading buffer [10 mM Tris.HCl (pH 8.0), 1 mM EDTA
and 100 mM MgCl2] to a final volume of 400 µl.
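
The stated 1:80 weight ratio follows directly from the quantities given. The Python sketch below reproduces it and applies the same calculation to the quantities of Example IV (200 µg FokI, 20 µl trypsin); the function name is illustrative.

```python
def trypsin_ratio(foki_ug, trypsin_ul, trypsin_mg_per_ml=0.2):
    """Return N such that the trypsin:FokI weight ratio is 1:N."""
    trypsin_ug = trypsin_ul * trypsin_mg_per_ml   # µl x mg/ml = µg
    return foki_ug / trypsin_ug

print("Example V  -> 1:%d" % round(trypsin_ratio(160, 10)))
print("Example IV -> 1:%d" % round(trypsin_ratio(200, 20)))
```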

The solution was loaded onto an oligo dT-
cellulose column (0.5 ml, Sigma, catalog #0-7751)
pre-equilibrated with the loading buffer. The
breakthrough was passed over the oligo dT-cellulose
column six times. The column was washed with 5 ml
of loading buffer and then eluted twice with 0.4 ml
of 10 mM Tris.HCl (pH 8.0), 1 mM EDTA. These
fractions contained the tryptic fragments that were
bound to the oligonucleotide DNA substrate. The
tryptic fragment bound to the oligo dT-cellulose
column was analyzed by SDS-polyacrylamide gel
electrophoresis.

In a separate reaction, the trypsin
digestion was carried out for 160 min to obtain
predominantly the 30 kDa, 25 kDa and 11 kDa
fragments in the reaction mixture.

Trypsin digestion of FokI endonuclease for
10 min yielded the 41 kDa N-terminal fragment and the 25
kDa C-terminal fragment as the predominant species
in the reaction mixture (Figure 5, Lane 3). When
this mixture was passed over the oligo dT-cellulose
column, only the 41 kDa N-terminal fragment was
retained by the column, suggesting that the DNA
binding property of FokI endonuclease is in the N-
terminal two-thirds of the enzyme. The 25 kDa fragment
was not retained by the oligo dT-cellulose column.

Trypsin digestion of the FokI-oligo complex
for 160 min yielded predominantly the 30 kDa, 25 kDa
and 11 kDa fragments (Figure 5, Lane 5). When this
reaction mixture was passed over the oligo dT-cellulose
column, only the 30 kDa and 11 kDa fragments were
retained. It appears these species together bind
DNA and they arise from further degradation of the 41
kDa N-terminal fragment. The 25 kDa fragment was
not retained by the oligo dT-cellulose column. It also
did not bind to DEAE and thus could be purified by
passage through a DEAE column and recovery in
the breakthrough volume.

FokI (390 µg) was incubated with a 2.5 molar
excess of oligonucleotide duplex [d-5'-
CCTCTGGATGCTCTC-3' (SEQ ID NO: 10): 5'-
GAGAGCATCCAGAGG-3' (SEQ ID NO: 11)] in a total volume
of 170 µl containing 10 mM Tris.HCl (pH 8), 50 mM
NaCl and 10% glycerol at RT for 1 hr. Digestion
with trypsin (30 µl; 0.2 mg/ml) in the absence of
MgCl2 was for 10 min at RT to maximize the yield of
the 41 kDa N-terminal fragment. The reaction was
quenched with excess antipain (200 µl). The tryptic
digest was passed through a DEAE column. The 25 kDa
C-terminal fragment was recovered in the
breakthrough volume. All the other tryptic
fragments (41 kDa, 30 kDa and 11 kDa) were retained
by the column and were eluted with 0.5 M NaCl buffer
(3 x 200 µl). In a separate experiment, the trypsin
digestion of the FokI-oligo complex was done in the
presence of 10 mM MgCl2 at RT for 60 min to maximize
the yield of the 30 kDa and 11 kDa fragments. The
purified 25 kDa fragment cleaved non-specifically both
unmethylated DNA substrate (pTZ19R; Figure 6) and
methylated DNA substrate (pACYCfokIM) in the
presence of MgCl2. The products are small,
indicating that the fragment is relatively non-specific in
cleavage. The products were dephosphorylated using
calf intestinal phosphatase and rephosphorylated
using polynucleotide kinase and [γ-32P]ATP. The
32P-labeled products were digested to
mononucleotides using DNase I and snake venom
phosphodiesterase. Analysis of the mononucleotides
by PEI-cellulose chromatography indicates that the
25 kDa fragment cleaved preferentially
phosphodiester bonds 5' to G>A>>T~C. The 25 kDa C-
terminal fragment thus constitutes the cleavage
domain of FokI endonuclease.

The 41 kDa N-terminal fragment-oligo
complex was purified by agarose gel electrophoresis.
FokI endonuclease (200 µg) was incubated with a 2.5
molar excess of oligonucleotide duplex [d-5'-
CCTCTGGATGCTCTC-3' (SEQ ID NO: 10): 5'-
GAGAGCATCCAGAGG-3' (SEQ ID NO: 11)] in a volume of 180
µl containing 10 mM Tris.HCl (pH 8.0), 50 mM NaCl
and 10% glycerol at RT for 1 hr. Tracer amounts of
32P-labeled oligonucleotide duplex were incorporated
into the complex to monitor it during gel
electrophoresis. Digestion with trypsin (20 µl; 0.2
mg/ml) was for 12 min at RT to maximize the yield of
the 41 kDa N-terminal fragment. The reaction was
quenched with excess antipain. The 41 kDa N-
terminal fragment-oligo complex was purified by
agarose gel electrophoresis. The band corresponding
to the complex was excised and recovered by
electroelution in a dialysis bag
(~600 µl). Analysis of the complex by SDS-PAGE
revealed the 41 kDa N-terminal fragment to be the major
component. The 30 kDa N-terminal fragment and the
11 kDa C-terminal fragment were present as minor
components. These together appeared to bind DNA and
co-migrate with the 41 kDa N-terminal fragment-oligo
complex.

The binding specificity of the 41 kDa N-
terminal fragment was determined using gel mobility
shift assays.

Example VI

Gel mobility shift assays
The specific oligos (d-5'-CCTCTGGATGCTCTC-
3' (SEQ ID NO: 10) and d-5'-GAGAGCATCCAGAGG-3' (SEQ
ID NO: 11)) were 5'-32P-labeled in a reaction
mixture of 25 µl containing 40 mM Tris.HCl (pH 7.5),
20 mM MgCl2, 50 mM NaCl, 10 mM DTT, 10 units of T4
polynucleotide kinase (from New England Biolabs) and
20 µCi [γ-32P]ATP (3000 Ci/mmol). The mixture was
incubated at 37°C for 30 min. The kinase was
inactivated by heating the reaction mixture to 70°C
for 15 min. After addition of 200 µl of water, the
solution was passed through a Sephadex G-25
(Superfine) column (Pharmacia) to remove the
unreacted [γ-32P]ATP. The final concentration of
labeled single-strand oligos was 27 µM.

The single strands were then annealed to
form the duplex in 10 mM Tris.HCl (pH 8.0), 50 mM
NaCl to a concentration of 12 µM. 1 µl of the
solution contained ~12 picomoles of oligo duplex
and ~50 x 10^3 cpm. The non-specific oligos (d-5'-
TAATTGATTCTTAA-3' (SEQ ID NO: 12) and d-5'-
ATTAAGAATCAATT-3' (SEQ ID NO: 13)) were labeled with
[γ-32P]ATP and polynucleotide kinase as described
herein. The single-stranded oligos were annealed to
yield the duplex at a concentration of 12 µM. 1 µl
of the solution contained ~12 picomoles of oligo
duplex and ~25 x 10^3 cpm.

10 µl of 41 kDa N-terminal fragment-oligo
complex (~2 pmoles) in 10 mM Tris.HCl, 50 mM NaCl
and 10 mM MgCl2 was incubated with 1 µl of 32P-
labeled specific oligonucleotide duplex (or 32P-
labeled non-specific oligonucleotide duplex) at 37°C
for 30 min and 120 min, respectively. 5 µl of 75%
glycerol was added to each sample and loaded on an 8%
nondenaturing polyacrylamide gel. Electrophoresis
was at 300 volts in TBE buffer until bromophenol
blue moved ~6 cm from the top of the gel. The gel
was dried and autoradiographed.

The complex readily exchanged 32P-labeled
specific oligonucleotide duplex that contained the
FokI recognition site as seen from the gel mobility
shift assays (Figure 7). It did not, however,
exchange the 32P-labeled non-specific
oligonucleotide duplex that did not contain the FokI
recognition site. These results indicate that all
the information necessary for sequence-specific
recognition of DNA is encoded within the 41 kDa N-
terminal fragment of FokI.
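
The distinction between the specific and non-specific duplexes can be checked directly from the listed sequences. The Python sketch below assumes the FokI recognition sequence is GGATG, the commonly catalogued site, which this Example does not spell out. It tests both strands of each oligonucleotide.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def has_foki_site(seq, site="GGATG"):
    """True if the assumed FokI site occurs on either strand."""
    return site in seq or site in reverse_complement(seq)

specific = ("CCTCTGGATGCTCTC", "GAGAGCATCCAGAGG")     # SEQ ID NO: 10 and 11
nonspecific = ("TAATTGATTCTTAA", "ATTAAGAATCAATT")    # SEQ ID NO: 12 and 13

print("strands fully complementary:", reverse_complement(specific[0]) == specific[1])
print("specific duplex has site:", has_foki_site(specific[0]))
print("non-specific duplex has site:", has_foki_site(nonspecific[0]))
```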

Example VII

Analysis of FokI by trypsin cleavage in the absence
of substrate.

A time course of trypsin digestion of FokI
endonuclease in the absence of the DNA substrate is
shown in Figure 8. Initially, FokI cleaved into a
58 kDa fragment and an 8 kDa fragment. The 58 kDa
fragment did not bind DNA substrates and was not
retained by the oligo dT-cellulose column. On
further digestion, the 58 kDa fragment degraded into
several intermediate tryptic fragments. However,
the complete trypsin digestion yielded only 25 kDa
fragments (appearing as two overlapping bands).

Each of these species (58 kDa, 25 kDa and
8 kDa) was purified by reversed-phase HPLC and
their amino-terminal amino acid sequences determined
(Table I). Comparison of the N-terminal sequences
to the predicted FokI sequence revealed the 8
kDa fragment to be N-terminal and the 58 kDa
fragment to be C-terminal. This further supports
the conclusion that the N-terminus of FokI is
responsible for the recognition domain. Sequencing
the N-terminus of the 25 kDa fragments revealed the
presence of two different components. A time course
of trypsin digestion of FokI endonuclease in the
presence of a non-specific DNA substrate yielded a
profile similar to the one obtained when trypsin
digestion of FokI is carried out in the absence of any
DNA substrate.

Example VIII
Cleavage specificity of the 25 kDa C-terminal
tryptic fragment of FokI

The 25 kDa C-terminal tryptic fragment of
FokI cleaved pTZ19R to small products, indicating
non-specific cleavage. The degradation products
were dephosphorylated by calf intestinal phosphatase
and 32P-labeled with polynucleotide kinase and
[γ-32P]ATP. The excess label was removed using a
Sephadex G-25 (Superfine) column. The labeled
products were then digested with 1 unit of
pancreatic DNase I (Boehringer-Mannheim) in buffer
containing 50 mM Tris.HCl (pH 7.6), 10 mM MgCl2 at 37°C
for 1 hr. Then, 0.02 units of snake venom
phosphodiesterase was added to the reaction mixture
and digested at 37°C for 1 hr.

Functional domains in FokI restriction endonuclease.
Analysis of functional domains of FokI (in
the presence and absence of substrates) using
trypsin is summarized in Figure 9. Binding of DNA
substrate by FokI was accompanied by alteration in
the structure of the enzyme. This study supports
the presence of two separate protein domains within
this enzyme: one for sequence-specific recognition
and the other for endonuclease activity. The
results indicate that the recognition domain is at
the N-terminus of the FokI endonuclease, while the
cleavage domain is probably in the C-terminal third
of the molecule.
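
The fragment sizes reported in Examples IV through VII are mutually consistent, as the following Python sketch shows. It uses only the apparent masses quoted in the text; gel-based masses are approximate, so the sums are a plausibility check rather than a measurement.

```python
WITH_DNA_KDA = {"N-terminal": 41, "C-terminal": 25}      # Example IV
WITHOUT_DNA_KDA = {"N-terminal": 8, "C-terminal": 58}    # Example VII
SUBFRAGMENTS_OF_41_KDA = (30, 11)                        # further digestion

def total(fragments):
    """Sum of apparent fragment masses in kDa."""
    return sum(fragments.values())

intact = total(WITH_DNA_KDA)
print("intact enzyme from both digests (kDa):", intact, total(WITHOUT_DNA_KDA))
print("30 + 11 kDa subfragments (kDa):", sum(SUBFRAGMENTS_OF_41_KDA))
print(f"25 kDa cleavage domain: {25 / intact:.0%} of the enzyme")
```

Both digests sum to the same apparent mass, and the 25 kDa fragment accounts for a little over one third of it, in line with the "C-terminal third" assignment above.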

Examples Relating to Construction
of Insertion Mutants (X-XIV)

The complete nucleotide sequence of the
FokI RM system has been published by various
laboratories (Looney et al., Gene 80:193-208, 1989
and Kita et al., J. Biol. Chem. 264:5751-56, 1989).
Experimental protocols for PCR are described, for
example, in Skoglund et al., Gene 88:1-5, 1990 and
in Bassing et al., Gene 113:83-88, 1992. The
procedures for cell growth and purification of the
mutant enzymes are similar to the ones used for the
wild-type FokI (Li et al., Proc. Nat'l. Acad. Sci.
USA 89:4275-79, 1992). Additional steps which
include Sephadex G-75 gel filtration and Heparin-
Sepharose CL-6B column chromatography were necessary
to purify the mutant enzymes to homogeneity.

Example X

Mutagenesis of SpeI Site at Nucleotide 162 within the
fokIR Gene

The two-step PCR technique used to
mutagenize one of the SpeI sites within the fokIR
gene is described in Landt et al., Gene 96:125-28,
1990. The three synthetic primers for this protocol
include: 1) the mutagenic primer
(5'-TCATAATAGCAACTAATTCTTTTTGGATCTT-3') (see SEQ ID NO: 24)
containing one base mismatch within the SpeI site;
2) the other primers, each of which is flanked by
restriction sites ClaI (5'-CCATCGATATAGCCTTTTTTATT-
3') (see SEQ ID NO: 25) and XbaI (5'-
GCTCTAGAGGATCCGGAGGT-3') (see SEQ ID NO: 26),
respectively. An intermediate fragment was
amplified using the XbaI primer and the mutagenic
primer during the first step. The ClaI primer was
then added to the intermediate for the second-step
PCR. The final 0.3 kb PCR product was digested with
XbaI/ClaI to generate cohesive ends and gel-
purified. The expression vector (pRRSfokIR) was
cleaved with XbaI/ClaI. The large 4.2 kb fragment
was then gel-purified and ligated to the PCR
fragment. The recombinant DNA was transfected into
competent E. coli RR1 [pACYCfokIM] cells. After
tetracycline and ampicillin antibiotic selection
several clones were picked, and their plasmid DNA
was examined by restriction analysis. The SpeI site
mutation was confirmed by sequencing the plasmid DNA
using Sanger's sequencing method (Sanger et al.,
Proc. Natl. Acad. Sci. USA 74:5463-67, 1977).
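
The single-base mismatch carried by the mutagenic primer can be located with a short search. The Python sketch below assumes the SpeI recognition sequence is ACTAGT, which is not stated in this Example, and reads SEQ ID NO: 24 as one continuous sequence. It scans the primer for windows that differ from the site at no more than one position.

```python
SPEI = "ACTAGT"   # assumed SpeI recognition sequence (palindromic)

def near_sites(seq, site=SPEI, max_mismatches=1):
    """List (index, window, mismatches) for windows within max_mismatches of site."""
    hits = []
    for i in range(len(seq) - len(site) + 1):
        window = seq[i:i + len(site)]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits

primer = "TCATAATAGCAACTAATTCTTTTTGGATCTT"   # SEQ ID NO: 24, read continuously
print("intact SpeI site present:", SPEI in primer)
print("near-matches:", near_sites(primer))
```

A window that differs from ACTAGT at exactly one position is what a primer designed to destroy the site, while otherwise matching the template, would be expected to contain.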

Example XI

Construction of Four (or Seven) Codon Insertion
Mutants

The PCR-generated DNA containing a four
(or seven) codon insertion was digested with
SpeI/XmaI and gel-purified. The plasmid pRRSfokIR
from Example X was cleaved with SpeI/XmaI, and the
large 3.9 kb fragment was gel-purified and ligated
to the PCR product. The recombinant DNA was
transfected into competent RR1 [pACYCfokIM] cells,
and the desired clones identified as described in
Example X. The plasmids from these clones were
isolated and sequenced to confirm the presence of
the four (or seven) codon insertion within the fokIR
gene.

In particular, the construction of the
mutants was performed as follows: (1) There are
two SpeI sites at nucleotides 162 and 1152,
respectively, within the fokIR gene sequence. The
site at 1152 is located near the trypsin cleavage
site of FokI that separates the recognition and
cleavage domains. In order to insert the four (or
seven) codons around this region, the other SpeI
site at 162 was mutagenized using a two-step PCR
technique (Landt et al., Gene 96:125-28, 1990).
Introduction of this SpeI site mutation in the fokIR
gene does not affect the expression levels of the
overproducer clones. (2) The insertion of four (or
seven) codons was achieved using the PCR technique.
The mutagenic primers used in the PCR amplification
are shown in Figure 11. Each primer has a 21 bp
sequence complementary to the fokIR gene. The 5'
ends of these primers are flanked by SpeI sites. The
codons for the KSEL and KSELEEK repeats are
incorporated between the SpeI site and the 21 bp
complement. Degenerate codons were used in these
repeats to circumvent potential problems during PCR
amplification. The other primer is complementary to
the 3' end of the fokIR gene and is flanked by an
XmaI site. The PCR-generated 0.6 kb fragments
containing the four (or seven) codon inserts were
digested with SpeI/XmaI and gel-purified. These
fragments were substituted into the high-expression
vector pRRSfokIR to generate the mutants. Several
clones of each mutant were identified and their DNA
sequence confirmed by Sanger's dideoxy chain
termination method (Sanger et al., Proc. Natl. Acad.
Sci. USA 74:5463-67, 1977).
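
As an illustration of how repeat-encoding sequences of this
kind can be checked, the short Python sketch below translates
the nucleotide sequences listed as SEQ ID NO: 35 and SEQ ID
NO: 37 in the Sequence Listing. The reading frame (starting at
the fourth nucleotide, inside the SpeI site ACTAGT) is an
assumption inferred from the peptides of SEQ ID NO: 34 and
SEQ ID NO: 36, and the helper names are hypothetical. It also
shows that the two KSEL repeats in SEQ ID NO: 35 are written
with different, synonymous codons.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Sequences copied from the Sequence Listing (spaces removed).
SEQ_35 = "GGACTAGTCAAATCTGAACTTAAAAGTGAACTGGAGGAGAAG"
SEQ_37 = "GGACTAGTCAAATCTGAACTTGAGGAGAAGAAAAGTGAACTGGAGGAGAAG"


def codons(dna):
    """Split a DNA string into complete codons, ignoring a trailing partial codon."""
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def translate(dna):
    """Translate a DNA string into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in codons(dna))


if __name__ == "__main__":
    frame_start = 3  # assumed reading frame: skip the first three nucleotides
    for name, seq in (("SEQ ID NO: 35", SEQ_35), ("SEQ ID NO: 37", SEQ_37)):
        print(name, translate(seq[frame_start:]))
    # Codons used for the two KSEL repeats in SEQ ID NO: 35.
    in_frame = codons(SEQ_35[frame_start:])
    print(in_frame[2:6], in_frame[6:10])
```

Writing the repeated KSEL unit with synonymous codons keeps the
two copies from being identical at the nucleotide level, which
is one plausible reading of the "degenerate codons" remark
above.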
Upon induction with 1 mM isopropyl β-D-
thiogalactoside (IPTG), the expression of mutant
enzymes in these clones became most prominent at 3
hrs as determined by SDS/PAGE. This was further
supported by the assays for the enzyme activity.
The levels of expression of the mutant enzymes in
these clones were much lower than that of wild-
type FokI. IPTG induction for longer times resulted
in lower enzyme levels, indicating that the mutant
enzymes were actively degraded within these clones.
This suggests that the insertion of four (or seven)
codons between the recognition and cleavage domains
of FokI destabilizes the protein conformation, making
the mutant enzymes more susceptible to degradation
within the cells. SDS/PAGE profiles of the mutant
enzymes are shown in Figure 12.

Example XII

Preparation of DNA Substrates with a Single FokI
Site

Two substrates, each containing a single
FokI recognition site, were prepared by PCR using
pTZ19R as the template. Oligonucleotide primers,
5'-CGCAGTGTTATCACTCAT-3' and 5'-CTTGGTTGAGTACTCACC-3'
(see SEQ ID NO:27 and SEQ ID NO:28, respectively),
were used to synthesize the 100 bp fragment.
Primers, 5'-ACCGAGCTCGAATTCACT-3' and
5'-GATTTCGGCCTATTGGTT-3' (see SEQ ID NO:29 and SEQ ID
NO:30, respectively), were used to prepare the 256
bp fragment. Individual strands within these
substrates were radiolabeled by using the
corresponding 32P-labeled phosphorylated primers
during PCR. The products were purified from
low-melting agarose gel, ethanol precipitated and
resuspended in TE buffer.
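
A quick sanity check on the four PCR primers can be scripted.
The Python sketch below is illustrative only: it reports the
length, G+C count and reverse complement of each primer, the
helper names are hypothetical, and nothing about primer
positions on pTZ19R or melting temperatures is implied.

```python
# Primer sequences as given in this Example (SEQ ID NO: 27-30).
PRIMERS = {
    "SEQ ID NO:27": "CGCAGTGTTATCACTCAT",
    "SEQ ID NO:28": "CTTGGTTGAGTACTCACC",
    "SEQ ID NO:29": "ACCGAGCTCGAATTCACT",
    "SEQ ID NO:30": "GATTTCGGCCTATTGGTT",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_count(seq):
    """Number of G or C nucleotides in a primer."""
    return sum(base in "GC" for base in seq)


def reverse_complement(seq):
    """Sequence of the opposite strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), gc_count(seq), reverse_complement(seq))
```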

Example XIII

Analysis of the Sequence Specificity of the Mutant
Enzymes

The agarose gel electrophoretic profiles of
the cleavage products of pTZ19R DNA by FokI and the
mutants are shown in Figure 13A. They are very
similar, suggesting that insertion of four (or seven)
codons in the linker region between the recognition
and cleavage domains does not alter the DNA sequence
specificity of the enzyme. This was further confirmed
by using 32P-labeled DNA substrates (100 bp and 256
bp), each containing a single FokI site. Substrates
containing individual strands labeled with 32P were
prepared as described in Example XII. FokI cleaves
the 256 bp substrate into two fragments, 180 bp and
72 bp, respectively (Figure 13B). The length of the
fragments was calculated from the 32P-labeled 5' end
of each strand. The autoradiograph of the agarose
gel is shown in Figure 13C. Depending on which
strand carries the 32P label in the substrate, either
the 72 bp fragment or the 180 bp fragment appears as
a band in the autoradiograph. The mutant enzymes yield
identical agarose gel profiles and autoradiographs.
Therefore, insertion of four (or seven) codons
between the recognition and cleavage domains does
not alter the DNA recognition mechanism of FokI
endonuclease.

Example XIV

Analysis of the Cleavage Distances from the
Recognition Site by the Mutant Enzymes

To determine the distance of cleavage by
the mutant enzymes, their cleavage products of the
32P-labeled substrates were analyzed by PAGE (Figure
14). The digests were analyzed alongside the
sequencing reactions of pTZ19R performed with the
same primers used in PCR to synthesize these
substrates. The cleavage patterns of the 100 bp
fragment by FokI and the mutants are shown in Figure
14A. The cut sites are shifted away from the
recognition site on both strands of the substrates in
the case of the mutants, as compared to the wild-type
enzyme. The small observable shifts between the
sequencing gel and the cleavage products are due to
the unphosphorylated primers that were used in the
sequencing reactions.

On the 5'-GGATG-3' strand, both mutants
cut the DNA 10 nucleotides away from the site, while
on the 5'-CATCC-3' strand they cut 14 nucleotides
away from the recognition site. These appear to be
the major cut sites for both mutants. A small
amount of cleavage similar to that of the wild-type
enzyme is also observed.
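
The cut-site distances above, and the fragment sizes reported
in Example XIII, can be related by a small calculation. The
Python sketch below is illustrative only: the function names,
the toy 30-nucleotide substrate, the 0-based coordinate
convention and the placement of the cut at position 180 are
assumptions, and the wild-type offsets of 9 and 13 nucleotides
are the commonly cited FokI spacing rather than values stated
in this Example. It locates the 5'-GGATG-3' site, applies a
pair of strand offsets, and checks that the two 5'-labeled
fragments produced by a staggered cut add up to the duplex
length minus the 4-nucleotide stagger (180 + 72 = 256 - 4).

```python
SITE = "GGATG"  # FokI recognition sequence (SEQ ID NO: 1)

# (top-strand offset, bottom-strand offset), in nucleotides past the site.
WILD_TYPE = (9, 13)  # commonly cited FokI spacing; assumed here
MUTANT = (10, 14)    # major cut sites reported above for both mutants

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(seq):
    """Sequence of the opposite strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def cut_positions(top_strand, offsets, site=SITE):
    """Return (top_cut, bottom_cut) in 0-based top-strand coordinates.

    A value p means the strand is cut between nucleotides p-1 and p.
    """
    start = top_strand.find(site)
    if start < 0:
        raise ValueError("recognition site not found")
    site_end = start + len(site)
    return site_end + offsets[0], site_end + offsets[1]


def labeled_fragment_lengths(duplex_length, top_cut, bottom_cut):
    """Length of the 5'-labeled fragment of each strand after a staggered cut."""
    top_fragment = top_cut                        # from the top-strand 5' end
    bottom_fragment = duplex_length - bottom_cut  # from the bottom-strand 5' end
    return top_fragment, bottom_fragment


if __name__ == "__main__":
    toy = "AA" + SITE + "N" * 23  # hypothetical 30-nt substrate
    print(reverse_complement(SITE))
    print(cut_positions(toy, WILD_TYPE))
    print(cut_positions(toy, MUTANT))
    # Hypothetical placement: top strand cut after 180 nt of a 256 bp duplex.
    print(labeled_fragment_lengths(256, 180, 184))
```

In this sketch the insertion mutants differ from the wild-type
enzyme only in the offset pair, so the stagger between the two
strands stays at four nucleotides while both cuts move one
nucleotide further from the site.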

The cleavage pattern of the 256 bp
fragment is shown in Figure 14B. The pattern of
cleavage is similar to that of the 100 bp fragment.
Some cleavage is seen 15 nucleotides away from the
recognition site on the 5'-CATCC-3' strand in the
case of the mutants. The multiple cut sites for the
mutant enzymes could be attributed to the presence
of different conformations in these proteins, or
to the increased flexibility of the spacer
region between the two domains. Depending on the
DNA substrate, some variation in the intensity of
cleavage at these sites was observed. This may be
due to the nucleotide sequence around these cut
sites. Naturally occurring Type IIS enzymes with
multiple cut sites have been reported (Szybalski et
al., Gene 100:13-26, 1991).



TABLE 1

Amino-terminal sequences of FokI
fragments from trypsin digestion

Fragment   Amino-terminal sequence   DNA substrate   SEQ ID NO

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Chandrasegaran, Srinivasan

(ii) TITLE OF INVENTION: Functional Domains in
FokI Restriction Endonuclease

(iii) NUMBER OF SEQUENCES: 40 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Cushman, Darby & Cushman 

(B) STREET: 1100 New York Ave., N.W.

(C) CITY: Washington 

(D) STATE: D.C. 

(E) COUNTRY: USA 

(F) ZIP: 20005-3918 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Kokulis, Paul N. 

(B) REGISTRATION NUMBER: 16,773 

(C) REFERENCE/ DOCKET NUMBER: 
PNK/4130/122364/CLB 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 202-861-3503 

(B) TELEFAX: 202-822-0944 

(C) TELEX: 6714627 CUSH

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GGATG 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
CCTAC 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 18..35

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CCATGGAGGT TTAAAAT ATG AGA TTT ATT GGC AGC
                   Met Arg Phe Ile Gly Ser
                   1               5

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Arg Phe Ile Gly Ser
1               5

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATACCATGGG AATTAAATGA CACAGCATCA

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 22..42

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AGGATCCGG AGGTTTAAAA T ATG GTT TCT AAA ATA AGA ACT 42
                       Met Val Ser Lys Ile Arg Thr
                       1               5



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Val Ser Lys Ile Arg Thr
1               5

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TAGGATCCTC ATTAAAAGTT TATCTCGCCG TTATT 35

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Asn Asn Gly Glu Ile Asn Phe
1               5



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
CCTCTGGATG CTCTC



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GAGAGCATCC AGAGG

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TAATTGATTC TTAA 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
ATTAAGAATC AATT 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CCTCTGGATG CTCTCAAAAA AAAAAAAAAA

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GAGAGCATCC AGAGGAAAAA AAAAAAAAAA 30



(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Val Ser Lys Ile Arg Thr Phe Gly Xaa Val Gln Asn Pro Gly
1               5                   10

Lys Phe Glu Asn Leu Lys Arg Val Val Gln Val Phe Asp Arg
15                  20                  25

Ser

(2) INFORMATION FOR SEQ ID NO: 17:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Ser Glu Ala Pro Cys Asp Ala Ile Ile Gln
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Gln Leu Val Lys Ser Glu Leu Glu Glu Lys
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Val Ser Lys Ile Arg Thr Phe Gly Trp Val
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Phe Thr Arg Val Pro Lys Arg Val Tyr
1               5

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Glu Glu Lys 
1 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Lys Ser Glu Leu 
1 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Lys Ser Glu Leu Glu Glu Lys
1               5

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
TAGCAACTAA TTCTTTTTGG ATCTT

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
CCATCGATAT AGCCTTTTTT ATT

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
GCTCTAGAGG ATCCGGAGGT

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
CGCAGTGTTA TCACTCAT

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CTTGGTTGAG TACTCACC

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
ACCGAGCTCG AATTCACT 18

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

GATTTCGGCC TATTGGTT 18 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 579 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Met Val Ser Lys Ile Arg Thr Phe Gly Trp Val Gln Asn Pro
1               5                   10

Gly Lys Phe Glu Asn Leu Lys Arg Val Val Gln Val Phe Asp
15                  20                  25

Arg Asn Ser Lys Val His Asn Glu Val Lys Asn Ile Lys Ile
    30                  35                  40

Pro Thr Leu Val Lys Glu Ser Lys Ile Gln Lys Glu Leu Val
        45                  50                  55

Ala Ile Met Asn Gln His Asp Leu Ile Tyr Thr Tyr Lys Glu
            60                  65                  70

Leu Val Gly Thr Gly Thr Ser Ile Arg Ser Glu Ala Pro Cys
                75                  80

Asp Ala Ile Ile Gln Ala Thr Ile Ala Asp Gln Gly Asn Lys
85                  90                  95

Lys Gly Tyr Ile Asp Asn Trp Ser Ser Asp Gly Phe Leu Arg
    100                 105                 110

Trp Ala His Ala Leu Gly Phe Ile Glu Tyr Ile Asn Lys Ser
        115                 120                 125

Asp Ser Phe Val Ile Thr Asp Val Gly Leu Ala Tyr Ser Lys
            130                 135                 140

Ser Ala Asp Gly Ser Ala Ile Glu Lys Glu Ile Leu Ile Glu
                145                 150

Ala Ile Ser Ser Tyr Pro Pro Ala Ile Arg Ile Leu Thr Leu
155                 160                 165

Leu Glu Asp Gly Gln His Leu Thr Lys Phe Asp Leu Gly Lys
    170                 175                 180

Asn Leu Gly Phe Ser Gly Glu Ser Gly Phe Thr Ser Leu Pro
        185                 190                 195

Glu Gly Ile Leu Leu Asp Thr Leu Ala Asn Ala Met Pro Lys
            200                 205                 210

Asp Lys Gly Glu Ile Arg Asn Asn Trp Glu Gly Ser Ser Asp
                215                 220

Lys Tyr Ala Arg Met Ile Gly Gly Trp Leu Asp Lys Leu Gly
225                 230                 235

Leu Val Lys Gln Gly Lys Lys Glu Phe Ile Ile Pro Thr Leu
    240                 245                 250

Gly Lys Pro Asp Asn Lys Glu Phe Ile Ser His Ala Phe Lys
        255                 260                 265

Ile Thr Gly Glu Gly Leu Lys Val Leu Arg Arg Ala Lys Gly
            270                 275                 280

Ser Thr Lys Phe Thr Arg Val Pro Lys Arg Val Tyr Trp Glu
                285                 290

Met Leu Ala Thr Asn Leu Thr Asp Lys Glu Tyr Val Arg Thr
295                 300                 305

Arg Arg Ala Leu Ile Leu Glu Ile Leu Ile Lys Ala Gly Ser
    310                 315                 320

Leu Lys Ile Glu Gln Ile Gln Asp Asn Leu Lys Lys Leu Gly
        325                 330                 335

Phe Asp Glu Val Ile Glu Thr Ile Glu Asn Asp Ile Lys Gly
            340                 345                 350

Leu Ile Asn Thr Gly Ile Phe Ile Glu Ile Lys Gly Arg Phe
                355                 360

Tyr Gln Leu Lys Asp His Ile Leu Gln Phe Val Ile Pro Asn
365                 370                 375

Arg Gly Val Thr Lys Gln Leu Val Lys Ser Glu Leu Glu Glu
    380                 385                 390

Lys Lys Ser Glu Leu Arg His Lys Leu Lys Tyr Val Pro His
        395                 400                 405

Glu Tyr Ile Glu Leu Ile Glu Ile Ala Arg Asn Ser Thr Gln
            410                 415                 420

Asp Arg Ile Leu Glu Met Lys Val Met Glu Phe Phe Met Lys
                425                 430

Val Tyr Gly Tyr Arg Gly Lys His Leu Gly Gly Ser Arg Lys
435                 440                 445

Pro Asp Gly Ala Ile Tyr Thr Val Gly Ser Pro Ile Asp Tyr
    450                 455                 460

Gly Val Ile Val Asp Thr Lys Ala Tyr Ser Gly Gly Tyr Asn
        465                 470                 475

Leu Pro Ile Gly Gln Ala Asp Glu Met Gln Arg Tyr Val Glu
            480                 485                 490

Glu Asn Gln Thr Arg Asn Lys His Ile Asn Pro Asn Glu Trp
                495                 500

Trp Lys Val Tyr Pro Ser Ser Val Thr Glu Phe Lys Phe Leu
505                 510                 515

Phe Val Ser Gly His Phe Lys Gly Asn Tyr Lys Ala Gln Leu
    520                 525                 530

Thr Arg Leu Asn His Ile Thr Asn Cys Asn Gly Ala Val Leu
        535                 540                 545

Ser Val Glu Glu Leu Leu Ile Gly Gly Glu Met Ile Lys Ala
            550                 555                 560

Gly Thr Leu Thr Leu Glu Glu Val Arg Arg Lys Phe Asn Asn
                565                 570

Gly Glu Ile Asn Phe
575



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Lys Gin Leu Val Lys Ser Glu Leu Glu Glu Lys 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
AAGCAACTAG TCAAAAGTGA ACTGGAGGAG AAG 33

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Leu Val Lys Ser Glu Leu Lys Ser Glu Leu Glu Glu Lys
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
GGACTAGTCA AATCTGAACT TAAAAGTGAA CTGGAGGAGA AG 42 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Leu Val Lys Ser Glu Leu Glu Glu Lys Lys Ser Glu Leu Glu
1               5                   10

Glu Lys
15



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
GGACTAGTCA AATCTGAACT TGAGGAGAAG AAAAGTGAAC TGGAGGAGAA G 51

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Asn Phe Xaa Xaa 
1 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
TTGAAAATTA CTCCTAGGGG CCCCCCT

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
GGATGNNNNNNNNNNNNNNNNNN

All publications mentioned hereinabove are
hereby incorporated by reference.

While the foregoing invention has been
described in some detail for purposes of clarity and
understanding, it will be appreciated by one skilled
in the art that various changes in form and detail
can be made without departing from the true scope of
the invention.

WHAT IS CLAIMED IS:

1. An isolated DNA segment encoding the 
recognition domain of a Type IIS endonuclease which 
contains the sequence-specific recognition activity 
of said Type IIS endonuclease. 

2. The DNA segment of claim 1 wherein
said Type IIS endonuclease is FokI restriction
endonuclease.

3. The DNA segment of claim 2 which
encodes amino acids 1-382 of the FokI restriction
endonuclease.

4. An isolated DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease. 

5. The DNA segment of claim 4 wherein
said Type IIS endonuclease is FokI restriction
endonuclease.

6. The DNA segment of claim 5 which
encodes amino acids 383-578 of the FokI restriction
endonuclease.

7. An isolated protein consisting
essentially of the N-terminus of the FokI
restriction endonuclease which protein has the
sequence-specific recognition activity of said
endonuclease.

8. An isolated protein consisting
essentially of the C-terminus of the FokI
restriction endonuclease which protein has the
cleavage activity of said endonuclease.

9. A DNA construct comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease; 

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 
and

(iii) a vector

wherein said first DNA segment and said
second DNA segment are operably linked to said
vector so that a single protein is produced.

10. The DNA construct according to claim 9
wherein said Type IIS endonuclease is FokI
restriction endonuclease.

11. The DNA construct according to claim 10
wherein said recognition domain is selected from
the group consisting of: zinc finger motifs, homeo
domain motifs, DNA binding domains of repressors,
DNA binding domains of oncogenes and naturally
occurring sequence-specific DNA binding proteins
that recognize >6 base pairs.

12. A procaryotic cell comprising:

(i) a first DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of said Type IIS 
endonuclease; 

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 
and 

(iii) a vector 
wherein said first DNA segment and said second DNA 
segment are operably linked to said vector so that a 
single protein is produced. 

13. A hybrid restriction enzyme
comprising the catalytic domain of a Type IIS
endonuclease which contains the cleavage activity of
said Type IIS endonuclease covalently linked to a
recognition domain of a protein other than said Type
IIS endonuclease.

14. The hybrid restriction enzyme of
claim 13 wherein said recognition domain which
comprises part of said hybrid restriction enzyme is
selected from the group consisting of: zinc finger
motifs, homeo domain motifs, DNA binding domains of
repressors, DNA binding domains of oncogenes and
naturally occurring sequence-specific DNA binding
proteins that recognize >6 base pairs.

15. A DNA construct comprising:

(i) a first DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of said Type IIS
endonuclease;

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 

(iii) a third DNA segment comprising one
or more codons, wherein said third DNA segment is
inserted between said first DNA segment and said
second DNA segment; and

(iv) a vector

wherein said first DNA segment, said
second DNA segment and said third DNA segment are
operably linked to said vector so that a single
protein is produced.

16. The DNA construct according to claim 15
wherein said Type IIS endonuclease is FokI
restriction endonuclease.

17. The DNA construct according to claim 16
wherein said third DNA segment consists
essentially of four codons.

18. The DNA construct according to claim 17
wherein said four codons of said third DNA
segment are inserted at nucleotide 1152 of the gene
encoding said endonuclease.

19. The DNA construct according to claim
16 wherein said third DNA segment consists
essentially of 7 codons.

20. The DNA construct according to claim
19 wherein said 7 codons of said third DNA segment
are inserted at nucleotide 1152 of the gene encoding
said endonuclease.

21. The DNA construct according to claim 
16 wherein said recognition domain is selected from 
the group consisting of: zinc finger motifs, homeo 
domain motifs, DNA binding domains of repressors, 
DNA binding domains of oncogenes and naturally 
occurring sequence-specific DNA binding proteins 
that recognize >6 base pairs. 

22. A procaryotic cell comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease; 

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 

(iii) a third DNA segment comprising one 
or more codons, wherein said third DNA segment is 
inserted between said first DNA segment and said 
second DNA segment; and 

(iv) a vector 

wherein said first DNA segment, said 
second DNA segment, and said third DNA segment are 
operably linked to said vector so that a single 
protein is produced.

23. The procaryotic cell of claim 22
wherein said third DNA segment consists essentially
of four codons.

24. The procaryotic cell of claim 22
wherein said third DNA segment consists essentially
of seven codons.

25. An isolated protein produced by the
procaryotic cell of claim 22.

26. An isolated DNA segment encoding the
N-terminus of a Type IIS endonuclease which contains
the sequence-specific recognition activity of said
Type II endonuclease, said Type II endonuclease
being FokI restriction endonuclease and having a
molecular weight of about 41 kilodaltons as measured
by SDS-polyacrylamide gel electrophoresis.

27. An isolated DNA segment encoding the
C-terminus of a Type IIS endonuclease which contains
the cleavage activity of said Type IIS endonuclease,
said Type II endonuclease being FokI restriction
endonuclease and having a molecular weight of about
25 kilodaltons as determined by SDS-polyacrylamide
gel electrophoresis.

28. An isolated protein consisting
essentially of the N-terminus of the FokI restriction
endonuclease which protein has the sequence-specific
recognition activity of said endonuclease and which
protein is amino acids 1-382 of said FokI restriction
endonuclease.

29. An isolated protein consisting
essentially of the C-terminus of the FokI
restriction endonuclease which protein has the
nuclease activity of said endonuclease and which
protein is amino acids 383-578 of said FokI
restriction endonuclease.

AMENDED CLAIMS
[received by the International Bureau
11 July 1994 (11.07.94); original claims 1-8 replaced by new
claims 1 and 2; original claims 9-25 and 28, 29 renumbered
as new claims 3-21 and 22, 23; original claims
26 and 27 cancelled (6 pages)]

1. An isolated DNA segment encoding the 
N-terminus of a Type IIS endonuclease which contains
the sequence-specific recognition activity of said
Type IIS endonuclease, said Type IIS endonuclease
being FokI restriction endonuclease and said
N-terminus having a molecular weight of about 41
kilodaltons as determined by SDS-polyacrylamide gel
electrophoresis wherein said isolated DNA segment
encodes amino acids 1-382 of said FokI restriction
endonuclease.

2. An isolated DNA segment encoding the
C-terminus of a Type IIS endonuclease which contains
the cleavage activity of said Type IIS endonuclease,
said Type IIS endonuclease being FokI and said
C-terminus having a molecular weight of about 25
kilodaltons, as determined by SDS-polyacrylamide gel
electrophoresis, wherein said isolated DNA segment
encodes amino acids 383-578 of said FokI restriction
endonuclease. 

3. A DNA construct comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease; 

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 
and 

(iii) a vector 
AMENDED SHEET (ARTICLE 19) 

wherein said first DNA segment and said
second DNA segment are operably linked to said
vector so that a single protein is produced.

4. The DNA construct according to claim 3 
wherein said Type IIS endonuclease is FokI
restriction endonuclease.

5. The DNA construct according to claim 4 
wherein said recognition domain is selected from the 
group consisting of: zinc finger motifs, homeo 
domain motifs, DNA binding domains of repressors, 
DNA binding domains of oncogenes and naturally 
occurring sequence-specific DNA binding proteins 
that recognize >6 base pairs. 

6. A procaryotic cell comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease; 

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 
and 

(iii) a vector 

wherein said first DNA segment and said second DNA 
segment are operably linked to said vector so that a 
single protein is produced. 

7. A hybrid restriction enzyme comprising 
the catalytic domain of a Type IIS endonuclease 
which contains the cleavage activity of said Type 
IIS endonuclease covalently linked to a recognition
domain of a protein other than said Type IIS 
endonuclease. 

8. The hybrid restriction enzyme of claim
7 wherein said recognition domain which comprises
part of said hybrid restriction enzyme is selected
from the group consisting of: zinc finger motifs,
homeo domain motifs, DNA binding domains of 
repressors, DNA binding domains of oncogenes and 
naturally occurring sequence-specific DNA binding 
proteins that recognize >6 base pairs. 

9. A DNA construct comprising: 

(i) a first DNA segment encoding the 
catalytic domain of a Type IIS endonuclease which 
contains the cleavage activity of said Type IIS 
endonuclease; 

(ii) a second DNA segment encoding a 
sequence-specific recognition domain other than the 
recognition domain of said Type IIS endonuclease; 

(iii) a third DNA segment comprising one 
or more codons, wherein said third DNA segment is 
inserted between said first DNA segment and said 
second DNA segment; and 

(iv) a vector 

wherein said first DNA segment, said 
second DNA segment and said third DNA segment are 
operably linked to said vector so that a single 
protein is produced. 

10. The DNA construct according to claim
9 wherein said Type IIS endonuclease is FokI
restriction endonuclease.

11. The DNA construct according to claim
10 wherein said third DNA segment consists
essentially of four codons.

12. The DNA construct according to claim
11 wherein said four codons of said third DNA
segment are inserted at nucleotide 1152 of the gene
encoding said endonuclease.

13. The DNA construct according to claim
10 wherein said third DNA segment consists
essentially of 7 codons.

14. The DNA construct according to claim 
13 wherein said 7 codons of said third DNA segment 
are inserted at nucleotide 1152 of the gene encoding 
said endonuclease. 
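
Claims 12 and 14 place the 4-codon and 7-codon insertions at nucleotide 1152 of the gene. The sketch below works through the arithmetic under two loudly stated assumptions that the claims themselves do not settle: that nucleotides are counted from 1 at the first base of the start codon, and that "at nucleotide 1152" means immediately after that nucleotide. Under those assumptions the insertion point is a codon boundary following codon 384, so a whole-codon insert keeps the reading frame. All names and the placeholder sequences are hypothetical.

```python
# ASSUMPTIONS (not stated in the claims): numbering starts at 1 on the first
# base of the start codon, and the insert goes directly after nucleotide 1152.
INSERT_AFTER_NT = 1152


def codon_boundary(nt_position):
    """Return (complete codons before the position, leftover nucleotides)."""
    return divmod(nt_position, 3)


def insert_codons(coding_seq, nt_position, insert):
    """Insert a whole number of codons after the first nt_position bases."""
    if len(insert) % 3 != 0:
        raise ValueError("insert must be a whole number of codons")
    return coding_seq[:nt_position] + insert + coding_seq[nt_position:]


codons_before, leftover = codon_boundary(INSERT_AFTER_NT)
print(codons_before, leftover)  # 384 0

# Placeholders only: 'N' for the 578-codon gene, 'n' for the inserted codons.
placeholder_gene = "N" * (578 * 3)
four_codon_mutant = insert_codons(placeholder_gene, INSERT_AFTER_NT, "n" * 12)
seven_codon_mutant = insert_codons(placeholder_gene, INSERT_AFTER_NT, "n" * 21)
print(len(four_codon_mutant) // 3, len(seven_codon_mutant) // 3)  # 582 585
```

If the numbering in the application starts elsewhere (for example at an upstream vector position), the codon index computed here would shift accordingly.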

15. The DNA construct according to claim 
10 wherein said recognition domain is selected from 
the group consisting of: zinc finger motifs, homeo 
domain motifs, DNA binding domains of repressors, 
DNA binding domains of oncogenes and naturally 
occurring sequence-specific DNA binding proteins 
that recognize >6 base pairs. 

16. A procaryotic cell comprising:

(i) a first DNA segment encoding the
catalytic domain of a Type IIS endonuclease which
contains the cleavage activity of said Type IIS
endonuclease;

(ii) a second DNA segment encoding a
sequence-specific recognition domain other than the
recognition domain of said Type IIS endonuclease;

(iii) a third DNA segment comprising one
or more codons, wherein said third DNA segment is
inserted between said first DNA segment and said
second DNA segment; and

(iv) a vector

wherein said first DNA segment, said 
second DNA segment, and said third DNA segment are 
operably linked to said vector so that a single 
protein is produced. 

17. The procaryotic cell of claim 16 
wherein said third DNA segment consists essentially 
of four codons. 

18. The procaryotic cell of claim 16 
wherein said third DNA segment consists essentially 
of seven codons. 

19. An isolated hybrid Type IIS 
endonuclease produced by the procaryotic cell of 
claim 16. 

20. An isolated DNA segment encoding the 
N-terminus of a Type IIS endonuclease which contains 
the sequence-specific recognition activity of said 
Type II endonuclease, said Type II endonuclease 
being FokI restriction endonuclease and having a
molecular weight of about 41 kilodaltons as measured
by SDS-polyacrylamide gel electrophoresis. 

21. An isolated DNA segment encoding the 
C-terminus of a Type IIS endonuclease which contains 
the cleavage activity of said Type IIS endonuclease, 
said Type II endonuclease being FokI restriction
endonuclease and having a molecular weight of about 
25 kilodaltons as determined by SDS-polyacrylamide 
gel electrophoresis. 

22. An isolated protein consisting 
essentially of the N-terminus of the FokI restriction
endonuclease which protein has the sequence-specific
recognition activity of said endonuclease and which
protein is amino acids 1-382 of said FokI restriction
endonuclease. 

23. An isolated protein consisting
essentially of the C-terminus of the FokI
restriction endonuclease which protein has the
nuclease activity of said endonuclease and which
protein is amino acids 383-578 of said FokI
restriction endonuclease.

Figure 1

FokIM
5' primer

      NcoI         7-bp spacer
5' TA CCA TGG AGGT TTAAAAT ATG AGA TTT ATT GGC AGC
             RBS           Met Arg Phe Ile Gly Ser

3' primer

   18-bp complement            NcoI
3' ACT ACG ACA CAG TAA ATT AAG GGTACC ATA 5'

FokIR
5' primer

      BamHI  RBS    7-bp spacer
5' TA GGATCC GGAGGT TTAAAAT ATG GTT TCT AAA ATA AGA ACT
                            Met Val Ser Lys Ile Arg Thr

3' primer

   Complementary strand                BamHI
3' TTA TTG CCG CTC TAT TTG AAA ATT ACT CCTAGG AT 5'
   Asn Asn Gly Glu Ile Asn Phe
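
The amino-acid labels under the Figure 1 primers can be cross-checked against the codons drawn above them. The following is a minimal sketch of that check, not part of the application: the codon table covers only the codons that appear in the figure, the helper names are hypothetical, and the second codon of the FokIR 3' primer is assumed to read TTG, consistent with its Asn label.

```python
# Partial standard-code table: only the codons that occur in Figure 1.
CODONS = {
    "ATG": "Met", "AGA": "Arg", "TTT": "Phe", "ATT": "Ile", "GGC": "Gly",
    "AGC": "Ser", "GTT": "Val", "TCT": "Ser", "AAA": "Lys", "ATA": "Ile",
    "ACT": "Thr", "AAT": "Asn", "AAC": "Asn", "GAG": "Glu",
    "TAA": "Stop", "TGA": "Stop",
}


def translate(dna):
    """Translate a 5'->3' coding sequence (spaces ignored) codon by codon."""
    dna = dna.replace(" ", "")
    return [CODONS[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3)]


def complement(dna):
    """Base-by-base complement; a strand written 3'->5' becomes 5'->3'."""
    return dna.replace(" ", "").translate(str.maketrans("ACGT", "TGCA"))


# Coding portions of the 5' primers, read 5'->3' as drawn.
fokim_5 = "ATG AGA TTT ATT GGC AGC"
fokir_5 = "ATG GTT TCT AAA ATA AGA ACT"
# FokIR 3' primer as drawn (3'->5'), up to the BamHI site; TTG is assumed.
fokir_3_drawn = "TTA TTG CCG CTC TAT TTG AAA ATT ACT"

print(translate(fokim_5))
print(translate(fokir_5))
print(translate(complement(fokir_3_drawn)))
```

Because the 3' primer is drawn 3'->5', its plain complement already reads 5'->3' from left to right, so no reversal is needed before translating. The last two codons of that complement come out as stop codons, which matches the figure, where the amino-acid labels end at Phe.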

FIGURE 2
FIG. 4
FIG. 6B
FIG. 8
FIGURE 9
FIGURE 10
Figure 11
FIG. 12
FIG. 13A
FIG. 13B
FIG. 13C
FIG. 14A

FIG. 15A

(A) wild-type FokI

5'- GGATGNNNNNNNNNNNNNNNNNN -3'
3'- CCTACNNNNNNNNNNNNNNNNNN -5'

FIG. 15B

(B) 4-codon insertion mutant

5'- GGATGNNNNNNNNNNNNNNNNNN -3'
3'- CCTACNNNNNNNNNNNNNNNNNN -5'

FIG. 15C

(C) 7-codon insertion mutant

5'- GGATGNNNNNNNNNNNNNNNNNN -3'
3'- CCTACNNNNNNNNNNNNNNNNNN -5'
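
FIGS. 15A-15C draw the GGATG recognition site followed by 18 unspecified bases on each strand. As a hedged illustration of how such a site can be located in a sequence, the sketch below scans the top strand for GGATG and reports cut positions at configurable distances downstream. The default offsets of 9 and 13 nucleotides are the commonly cited wild-type FokI cut distances and are an assumption here, since the cut-site markings of the figure are not preserved in this text. No offsets are assumed for the insertion mutants, and sites in the reverse orientation are not searched. The function name and test sequence are hypothetical.

```python
SITE = "GGATG"  # FokI recognition sequence, top strand, 5'->3'


def find_sites(seq, top_offset=9, bottom_offset=13):
    """Locate SITE on the top strand.

    Returns a list of (site_start, top_cut, bottom_cut) tuples. All values
    are 0-based top-strand indices; each cut value is the index of the first
    base after the cut. The offsets are ASSUMED defaults, see the lead-in.
    """
    hits = []
    start = seq.find(SITE)
    while start != -1:
        end = start + len(SITE)
        hits.append((start, end + top_offset, end + bottom_offset))
        start = seq.find(SITE, start + 1)  # step by 1 to allow overlaps
    return hits


# Hypothetical test sequence: 2 flanking bases, the site, 18 downstream bases.
example = "TT" + SITE + "ACGTACGTACGTACGTAC"
print(find_sites(example))  # [(2, 16, 20)]
```

Passing different `top_offset` and `bottom_offset` values lets the same function describe an enzyme whose cut distance differs from the wild-type defaults assumed above.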